Naive Prediction of the Future

Probability and Statistics
Time Series Analysis
Public

July 9, 2025

Defining Baseline Models

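  • Naive forecast: the most recent observation is the best estimate of the future. \(\hat{p}_{t+1} = p_t\)
  • Trend analysis: carries the change between the previous period and the current period into the next period's sales forecast. \(\hat{p}_{t+1} = p_t + (p_t - p_{t-1})\)
  • Simple moving average: the mean over a time window that keeps sliding forward
    • Larger time window: looks further back into the past
  • Weighted moving average: assigns different weights to the observations in the window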
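
The four definitions above can be sketched as small functions. This is only an illustration on a made-up series: the function names (`naive`, `trend`, `moving_average`, `weighted_moving_average`), the window size, and the weights are assumptions for the example and do not come from the notebook below.

```python
import numpy as np

def naive(p):
    """Naive forecast: the next value equals the last observed value."""
    return p[-1]

def trend(p):
    """Trend forecast: the last value plus the most recent change."""
    return p[-1] + (p[-1] - p[-2])

def moving_average(p, window):
    """Simple moving average: unweighted mean of the last `window` values."""
    return np.mean(p[-window:])

def weighted_moving_average(p, weights):
    """Weighted moving average over the last len(weights) values (oldest first)."""
    weights = np.asarray(weights, dtype=float)
    return np.sum(p[-len(weights):] * weights) / np.sum(weights)

p = np.array([10.0, 12.0, 13.0, 15.0])  # made-up series, oldest value first

print(naive(p))                               # last value
print(trend(p))                               # last value + last change
print(moving_average(p, window=3))            # mean of 12, 13, 15
print(weighted_moving_average(p, [1, 2, 3]))  # the newest value weighs most
```

Giving the newest observation the largest weight is one common choice; any other weighting scheme fits the same function.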

Mean of the entire history (trend)

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

plt.rcParams['font.family'] = 'Noto Sans KR'
df = pd.read_csv('_data/jj.csv')

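# Hold out the last 4 observations as the test set.
# .copy() avoids pandas' SettingWithCopyWarning when prediction columns are added to test.
train = df[:-4].copy()
test = df[-4:].copy()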

historical_mean = np.mean(train['data'])
historical_mean
np.float64(4.308499987499999)
test['pred_mean'] = historical_mean
def mape(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred) / y_true) * 100

mape_hist_mean = mape(test['data'], test['pred_mean'])
mape_hist_mean
np.float64(70.00752579965119)
sns.lineplot(data=train, x='date', y='data', label='Train')
sns.lineplot(data=test, x='date', y='data', label='Test')
sns.lineplot(data=test, x='date', y='pred_mean', label='Historical mean')
plt.xticks(np.arange(0, 85, 8), np.arange(1960, 1981, 2))
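
To make the metric concrete, here is a small worked example of MAPE, \(\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|\), on made-up numbers. The values and the name `mape_sketch` are assumptions for illustration. Unlike the `mape` function above, this sketch takes the absolute value of the denominator as well, which gives the same result whenever the actual values are positive.

```python
import numpy as np

def mape_sketch(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true)) * 100

y_true = np.array([10.0, 20.0, 40.0])  # made-up actual values that keep growing
y_pred = np.array([5.0, 5.0, 5.0])     # a flat forecast well below them

# Per-point relative errors: 0.5, 0.75, 0.875
relative_errors = np.abs(y_true - y_pred) / np.abs(y_true)
result = mape_sketch(y_true, y_pred)
print(relative_errors, result)
```

A flat forecast that sits far below a growing series produces large relative errors at every point, which is one plausible reason the historical mean scores a MAPE of about 70% here.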

Mean of recent observations only (trend)

last_year_mean = np.mean(train.iloc[-4:]['data'])
test['pred_last_yr_mean'] = last_year_mean
mape_last_year_mean = mape(test['data'], test['pred_last_yr_mean'])
mape_last_year_mean
np.float64(15.5963680725103)
sns.lineplot(data=train, x='date', y='data', label='Train')
sns.lineplot(data=test, x='date', y='data', label='Test')
sns.lineplot(data=test, x='date', y='pred_last_yr_mean', label='Last-year mean')
plt.xticks(np.arange(0, 85, 8), np.arange(1960, 1981, 2))
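
The effect of the window size can be illustrated with a sketch on a synthetic, purely trending series. The data, the window sizes, and the helper name `window_mean_mape` are assumptions for this example and are not taken from `jj.csv`.

```python
import numpy as np

def window_mean_mape(train, test, window):
    """MAPE (in percent) of a flat forecast equal to the mean of the last `window` training values."""
    forecast = np.mean(train[-window:])
    return np.mean(np.abs(test - forecast) / np.abs(test)) * 100

series = np.arange(1.0, 21.0)           # synthetic upward trend: 1, 2, ..., 20
train, test = series[:-4], series[-4:]  # hold out the last 4 points

# Compare a long, a medium, and a short window.
results = {w: window_mean_mape(train, test, w) for w in (16, 8, 4)}
for w, score in results.items():
    print(f"window={w:2d}  MAPE={score:.1f}%")
```

On a series with a steady trend, shorter windows track the current level more closely. With seasonal data this need not hold, so the window is best chosen by comparing errors on held-out data.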

Naive forecast

last = train.iloc[-1]['data']
test['pred_last'] = last
mape_last = mape(test['data'], test['pred_last'])
mape_last
np.float64(30.457277908606535)
sns.lineplot(data=train, x='date', y='data', label='Train')
sns.lineplot(data=test, x='date', y='data', label='Test')
sns.lineplot(data=test, x='date', y='pred_last', label='Naive forecast')
plt.xticks(np.arange(0, 85, 8), np.arange(1960, 1981, 2))
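
The code above holds the last training value flat across the whole test horizon. The formula \(\hat{p}_{t+1} = p_t\) can also be read as a one-step-ahead forecast, where each prediction uses the actual value just before it. The following sketch contrasts the two readings on a made-up series; the data and variable names are assumptions for illustration.

```python
import pandas as pd

s = pd.Series([10.0, 12.0, 13.0, 15.0, 14.0, 18.0])  # made-up series
train, test = s[:-2], s[-2:]

# Flat multi-step forecast: repeat the last training value over the horizon.
flat = pd.Series(train.iloc[-1], index=test.index)

# One-step-ahead naive forecast: each prediction is the previous actual value.
one_step = s.shift(1).loc[test.index]

print(flat.tolist())      # last training value repeated
print(one_step.tolist())  # previous actual value at each step
```

The one-step version uses test-period actuals as they arrive, so its error is not directly comparable with the flat multi-step forecast evaluated above.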

Seasonal naive forecast

test['pred_last_season'] = train.iloc[-4:]['data'].values
mape_naive_seasonal = mape(test['data'], test['pred_last_season'])
mape_naive_seasonal
np.float64(11.561658552433654)
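
The assignment above works because the test set is exactly as long as one season (4 observations). For other horizons, the last observed season can be repeated as often as needed. This is a minimal sketch under that assumption; the helper `seasonal_naive` and the sample values are hypothetical.

```python
import numpy as np

def seasonal_naive(history, season_length, horizon):
    """Repeat the last observed season until `horizon` forecasts are produced."""
    history = np.asarray(history, dtype=float)
    last_season = history[-season_length:]
    repeats = int(np.ceil(horizon / season_length))
    return np.tile(last_season, repeats)[:horizon]

history = [1, 2, 3, 4, 2, 3, 4, 5]  # two made-up seasons of length 4
forecast = seasonal_naive(history, season_length=4, horizon=6)
print(forecast)  # the last season, then its first two values again
```

With `season_length=4` and `horizon=4` this reduces to the same values as `train.iloc[-4:]['data'].values`.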