1:B 2:C 3:C 4:C 5:CD 6:D 7:C
(Write your code in the blank cell below each question and run it to display the result.)
speed: 4,4,7,7,8,9
dist: 2,10,4,22,16,10
(1) Draw a scatter plot of speed against dist and use it to judge whether the relationship between speed and dist is roughly linear.
import pandas as pd
data71=pd.read_excel('mydata1.xlsx','7.1')
speed=data71.speed
dist=data71.dist
import matplotlib.pyplot as plt #load the basic plotting package
plt.rcParams['font.sans-serif']=['KaiTi']; #KaiTi font; SimHei (Hei typeface) is an alternative
plt.rcParams['axes.unicode_minus']=False; #render minus signs correctly in figures
plt.plot(speed,dist,'o');
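Judging from the scatter plot, speed and dist do not show a linear relationship.
(2) Compute the correlation coefficient between speed and dist and test its significance.
speed.corr(dist) #correlation coefficient, textbook p. 142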
0.41440950942598237
import scipy.stats as st #load the statistics package
st.pearsonr(speed,dist) #hypothesis test, textbook p. 143
(0.4144095094259823, 0.4139700944521663)
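The first value is the sample correlation coefficient, r=0.4144 (it is not a t statistic); the second is the two-sided p-value, p=0.41397>0.05. At the α=0.05 level, we cannot conclude that speed and dist are significantly correlated.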
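To make the distinction between r and t concrete, the following minimal sketch recomputes r by hand from the six data points given in the exercise and then converts it to the t statistic with n-2 degrees of freedom, using t = r·sqrt(n-2)/sqrt(1-r²). The helper name `pearson_r` is an illustrative assumption, not part of the original notebook.

```python
import math

# Data copied from the exercise statement.
speed = [4, 4, 7, 7, 8, 9]
dist = [2, 10, 4, 22, 16, 10]

def pearson_r(x, y):
    """Sample Pearson correlation coefficient (hypothetical helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

n = len(speed)
r = pearson_r(speed, dist)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r = {r:.4f}, t = {t:.4f}, df = {n - 2}")
```

The resulting t is about 0.9107 on 4 degrees of freedom, which is the value the p-value of 0.41397 corresponds to.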
(3) Fit an OLS regression of dist on speed and report the usual statistics.
import statsmodels.api as sm
fm1=sm.OLS(dist,sm.add_constant(speed)).fit() #textbook p. 146
fm1.params
const    0.992248
speed    1.488372
dtype: float64
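As a cross-check on these coefficients, simple linear regression has a closed form: slope = Sxy/Sxx and intercept = ȳ - slope·x̄. The sketch below applies it to the exercise data without statsmodels; the function name `simple_ols` is an illustrative assumption.

```python
# Data copied from the exercise statement.
speed = [4, 4, 7, 7, 8, 9]
dist = [2, 10, 4, 22, 16, 10]

def simple_ols(x, y):
    """Closed-form least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)                       # spread of x
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # co-variation of x and y
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = simple_ols(speed, dist)
print(f"const = {intercept:.6f}, speed = {slope:.6f}")
```

Here Sxx = 21.5 and Sxy = 32, so the slope is 32/21.5, in agreement with the fitted parameters shown above.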
(4) Predict dist when speed=30.
import statsmodels.formula.api as smf
fm2=smf.ols('dist~speed',data71).fit()
fm2.summary2().tables[1]
 | Coef. | Std.Err. | t | P>\|t\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
Intercept | 0.992248 | 11.064381 | 0.089679 | 0.932853 | -29.727398 | 31.711895 |
speed | 1.488372 | 1.634317 | 0.910700 | 0.413970 | -3.049220 | 6.025965 |
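The speed row of this table can be reproduced by hand, which shows where each column comes from. The sketch below is a minimal illustration under one stated assumption: the critical value `T_CRIT` is hard-coded as the tabulated 97.5% quantile of Student's t with 4 degrees of freedom (about 2.776445) rather than computed from a statistics library.

```python
import math

# Data copied from the exercise statement.
speed = [4, 4, 7, 7, 8, 9]
dist = [2, 10, 4, 22, 16, 10]

n = len(speed)
mean_x, mean_y = sum(speed) / n, sum(dist) / n
sxx = sum((x - mean_x) ** 2 for x in speed)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(speed, dist))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares and residual variance on n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(speed, dist))
mse = sse / (n - 2)

# Standard error and t statistic of the slope.
se_slope = math.sqrt(mse / sxx)
t_slope = slope / se_slope

# Assumption: tabulated t quantile for 97.5% with 4 degrees of freedom.
T_CRIT = 2.776445
ci_low = slope - T_CRIT * se_slope
ci_high = slope + T_CRIT * se_slope

print(f"Std.Err. = {se_slope:.6f}, t = {t_slope:.6f}")
print(f"95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
```

The confidence interval for the slope contains zero, which is consistent with the p-value of 0.413970 in the table.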
fm2.predict(pd.DataFrame({'speed':[30]})) #textbook p. 149
0    45.643411
dtype: float64
So the predicted value at speed=30 is dist=45.6434.
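The prediction is just the fitted line evaluated at speed=30. The short sketch below redoes that arithmetic with the rounded coefficients from the table and also checks whether the new value lies inside the observed range of speed. The variable names are illustrative assumptions. Because 30 is far outside the observed range of 4 to 9 and the slope is not significant (p=0.413970), the point prediction should be read with caution.

```python
# Observed predictor values, copied from the exercise statement.
speed = [4, 4, 7, 7, 8, 9]

# Rounded coefficients as reported in the regression table.
intercept = 0.992248
slope = 1.488372

new_speed = 30
predicted_dist = intercept + slope * new_speed

# Flag extrapolation: the new value is outside the range the model was fitted on.
is_extrapolation = not (min(speed) <= new_speed <= max(speed))

print(f"predicted dist = {predicted_dist:.4f}")
print(f"extrapolation beyond [{min(speed)}, {max(speed)}]: {is_extrapolation}")
```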
x:0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.20, 0.21, 0.23
y:42, 43.5, 45, 45.5, 45, 47.5, 49, 53, 50, 55, 55, 60
(1) Draw a scatter plot of x against y and use it to judge whether the relationship between x and y is roughly linear.
data72=pd.read_excel('mydata1.xlsx','7.2')
x72=data72.x
y72=data72.y
plt.plot(x72,y72,'.');
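Judging from the scatter plot, x and y show a roughly linear, positive relationship.
(2) Compute the correlation coefficient between x and y and test its significance.
x72.corr(y72) #correlation coefficient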
0.9736871624234448
st.pearsonr(x72,y72) #hypothesis test
(0.9736871624234449, 9.504890245995481e-08)
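Since p=9.504890245995481e-08<0.05, at the α=0.05 level we can conclude that x and y are significantly correlated.
(3) Fit a least-squares regression of y on x and report the usual statistics.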
fm72=sm.OLS(y72,sm.add_constant(x72)).fit()
fm72.params
const    28.492819
x        130.834829
dtype: float64
fm721=smf.ols('y72~x72',data72).fit() #textbook p. 148
fm721.summary2().tables[1]
 | Coef. | Std.Err. | t | P>\|t\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
Intercept | 28.492819 | 1.579806 | 18.035644 | 5.881686e-09 | 24.972792 | 32.012846 |
x72 | 130.834829 | 9.683379 | 13.511278 | 9.504890e-08 | 109.258917 | 152.410742 |
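Another commonly reported statistic for this fit is the coefficient of determination, R² = 1 - SSE/SST, which in simple linear regression equals the square of the correlation coefficient. The sketch below computes it by hand from the exercise data as a minimal illustration; the variable names are assumptions and the calculation does not use statsmodels.

```python
# Data copied from the exercise statement.
x = [0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.20, 0.21, 0.23]
y = [42, 43.5, 45, 45.5, 45, 47.5, 49, 53, 50, 55, 55, 60]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Closed-form least-squares coefficients.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Total and residual sums of squares.
sst = sum((b - mean_y) ** 2 for b in y)
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

r_squared = 1 - sse / sst
print(f"intercept = {intercept:.6f}, slope = {slope:.6f}")
print(f"R^2 = {r_squared:.4f}")
```

R² comes out at about 0.948, so the fitted line accounts for roughly 95% of the variation in y.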
(4) Estimate y when x=0.22, and predict y when x=0.25.
fm721.predict(pd.DataFrame({'x72':[0.22,0.25]}))
0    57.276481
1    61.201526
dtype: float64
At x=0.22 the estimated value is y=57.2765; at x=0.25 the predicted value is y=61.2015.
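One plausible reading of the question's wording is that x=0.22 falls inside the observed range of x (0.10 to 0.23), so the fitted value is an estimate by interpolation, while x=0.25 falls outside it, so the fitted value is a prediction by extrapolation. The sketch below refits the line by hand and labels each case. The helper name `fitted_value` is an illustrative assumption.

```python
# Data copied from the exercise statement.
x = [0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.20, 0.21, 0.23]
y = [42, 43.5, 45, 45.5, 45, 47.5, 49, 53, 50, 55, 55, 60]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def fitted_value(x_new):
    """Return the fitted y and whether x_new lies within the observed x range."""
    within_range = min(x) <= x_new <= max(x)
    return intercept + slope * x_new, within_range

for x_new in (0.22, 0.25):
    value, within_range = fitted_value(x_new)
    kind = "interpolation" if within_range else "extrapolation"
    print(f"x = {x_new}: fitted y = {value:.4f} ({kind})")
```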