Simple Forecasting of the Future

Probability and Statistics
Time Series Analysis
Public

July 9, 2025

Defining the baseline models

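  • Naive forecast: the most recent observation is the best estimate of the future. \(\hat{p}_{t+1} = p_t\)
  • Trend analysis: carries the change between the previous period and the current period into the sales forecast for the next period. \(\hat{p}_{t+1} = p_t + (p_t - p_{t-1})\)
  • Simple moving average: the mean over a time window that keeps sliding forward.
    • Larger time window: looks further back into the past.
  • Weighted moving average: a moving average that assigns different weights to the observations in the window.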
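
The four definitions above can be sketched on a toy series. This is only an illustration under stated assumptions: the array `p`, the window length of 3, and the weights are hypothetical and are not taken from the data used below.

```python
import numpy as np

# Hypothetical series, used only for illustration (not the jj.csv data).
p = np.array([10.0, 12.0, 13.0, 15.0, 18.0])

# Naive forecast: the next value equals the latest observation.
naive = p[-1]

# Trend forecast: latest value plus the latest one-step change.
trend = p[-1] + (p[-1] - p[-2])

# Simple moving average over the last `window` observations.
window = 3
sma = p[-window:].mean()

# Weighted moving average: heavier weights on more recent observations.
# The weights are an assumption; they only need to sum to 1.
weights = np.array([0.2, 0.3, 0.5])
wma = float(np.dot(p[-window:], weights))

print(naive, trend, sma, wma)
```

A larger `window` pulls the moving average toward older observations, while weights concentrated on the last element make the weighted average behave more like the naive forecast.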

Mean of the entire history (trend)

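import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Font with Korean glyph support for plot text
plt.rcParams['font.family'] = 'Noto Sans KR'
df = pd.read_csv('_data/jj.csv')

# Hold out the last four quarters (one year) as the test set
train = df[:-4]
test = df[-4:].copy()  # copy so that new columns are not written to a slice of df

# Baseline: mean of the entire training history
historical_mean = np.mean(train['data'])
historical_mean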
4.308499987499999
test['pred_mean'] = historical_mean

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

mape_hist_mean = mape(test['data'], test['pred_mean'])
mape_hist_mean
70.00752579965119
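
For reference, the error metric used throughout is the mean absolute percentage error. Writing \(y_i\) for the actual test values, \(\hat{y}_i\) for the forecasts, and \(n\) for the number of test points (symbols introduced here only for illustration), the `mape` function computes the following when the actual values are positive:

\[
\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
\]

A value of about 70 therefore means the forecast is off by roughly 70% of the actual value on average over the test period.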
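# Plot the training data, the test data, and the historical-mean forecast
sns.lineplot(data=train, x='date', y='data', label='Train')
sns.lineplot(data=test, x='date', y='data', label='Test')
sns.lineplot(data=test, x='date', y='pred_mean', label='Historical mean forecast')
# Label every 8th quarter with its year (1960, 1962, ..., 1980)
plt.xticks(np.arange(0, 85, 8), np.arange(1960, 1981, 2))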
[Figure: line plot of the training data, the test data, and the historical-mean forecast; x-axis labeled by year from 1960 to 1980.]

Mean of the most recent year only (trend)

# Baseline: mean of the last four training quarters (the most recent year)
last_year_mean = np.mean(train.iloc[-4:]['data'])
test['pred_last_yr_mean'] = last_year_mean
mape_last_year_mean = mape(test['data'], test['pred_last_yr_mean'])
mape_last_year_mean
15.5963680725103
sns.lineplot(data=train, x='date', y='data', label='Train')
sns.lineplot(data=test, x='date', y='data', label='Test')
sns.lineplot(data=test, x='date', y='pred_last_yr_mean', label='Last-year mean forecast')
plt.xticks(np.arange(0, 85, 8), np.arange(1960, 1981, 2))
[Figure: line plot of the training data, the test data, and the last-year mean forecast; x-axis labeled by year from 1960 to 1980.]

Naive forecast

# Baseline: repeat the last observed training value for every test point
last = train.iloc[-1]['data']
test['pred_last'] = last
mape_last = mape(test['data'], test['pred_last'])
mape_last
30.457277908606535
sns.lineplot(data=train, x='date', y='data', label='Train')
sns.lineplot(data=test, x='date', y='data', label='Test')
sns.lineplot(data=test, x='date', y='pred_last', label='Naive forecast')
plt.xticks(np.arange(0, 85, 8), np.arange(1960, 1981, 2))
[Figure: line plot of the training data, the test data, and the naive (last value) forecast; x-axis labeled by year from 1960 to 1980.]

Seasonal forecast

# Baseline: reuse the last four training quarters, quarter by quarter, as the forecast
test['pred_last_season'] = train.iloc[-4:]['data'].values
mape_naive_seasonal = mape(test['data'], test['pred_last_season'])
mape_naive_seasonal
11.561658552433654
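
The four MAPE values reported above can be compared side by side. The snippet below is a small sketch that hard-codes those reported values (rounded to two decimals) instead of recomputing them; the dictionary keys are descriptive labels chosen here, not names from the original code.

```python
# MAPE values (%) as reported in the sections above, rounded to two decimals.
mape_scores = {
    "historical mean": 70.01,
    "last-year mean": 15.60,
    "last value": 30.46,
    "seasonal naive": 11.56,
}

# Sort from the lowest error (best) to the highest error (worst).
ranking = sorted(mape_scores.items(), key=lambda kv: kv[1])
for name, score in ranking:
    print(f"{name:<16} {score:6.2f}%")

best_method = ranking[0][0]
```

On this test set the seasonal forecast has the lowest error, followed by the last-year mean, the last value, and the historical mean.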