Test-passed: we recover the equivalent subgroup dataframe.

admin 2024-02-11

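Study notes on Tang Yudi's "Python Data Analysis and Machine Learning in Practice"

32 Exploratory Data Analysis: Soccer Match Dataset

Raw data: link (extraction code: yypl)

The data contains information on players and referees, taken from match data for 2012-2013. It involves 2053 players and 3147 referees in total. The features are listed below: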

　　

1.1 Importing the data and modules

　　(146028, 28)
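The loading code did not survive in these notes, so here is a minimal sketch of the step. The in-memory CSV and its rows are made up so the example runs on its own; with the real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Tiny stand-in for the real file (hypothetical rows; the real data is 146028 x 28).
csv_text = """playerShort,club,leagueCountry,height,games,redCards,refNum
player-a,Club X,Spain,177.0,1,0,1
player-b,Club Y,France,179.0,1,0,2
player-a,Club X,Spain,177.0,2,1,3
"""

# With the real dataset this would be pd.read_csv(<path to the downloaded file>).
df = pd.read_csv(io.StringIO(csv_text))

# shape is (number of rows, number of columns)
print(df.shape)
```

On the real dataset the same `df.shape` call is what produces a tuple like the one shown above.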

　　

1.2 Simple statistics (count is the number of non-null values):
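A minimal sketch of this step, assuming the summary was produced with `DataFrame.describe()`. The small frame below is made up; it includes one missing height to show that `count` skips null values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing height value.
df = pd.DataFrame({
    "height": [177.0, 179.0, np.nan, 182.0],
    "games": [1, 1, 2, 3],
})

# describe() reports count, mean, std, min, quartiles and max for numeric columns.
summary = df.describe()

# count only includes non-null entries, so height has one fewer than games.
print(summary.loc["count"])
```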

　　

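1.3 Checking data types:

(Machine-learning models only work with float and int columns, so other types have to be mapped or converted before modeling. For exploratory analysis that is not needed here.)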

playerShort       object
player            object
club              object
leagueCountry     object
birthday          object
height           float64
weight           float64
position          object
games              int64
victories          int64
ties               int64
defeats            int64
goals              int64
yellowCards        int64
yellowReds         int64
redCards           int64
photoID           object
rater1           float64
rater2           float64
refNum             int64
refCountry         int64
Alpha_3           object
meanIAT          float64
nIAT             float64
seIAT            float64
meanExp          float64
nExp             float64
seExp            float64
dtype: object
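As a rough illustration of the note about types, the sketch below lists dtypes, picks out the numeric columns, and maps one text column to integer codes. The sample rows are made up, and category codes are just one possible mapping, not necessarily the one used in the course.

```python
import pandas as pd

# Hypothetical sample rows using a few of the real column names.
df = pd.DataFrame({
    "playerShort": ["player-a", "player-b", "player-c"],
    "position": ["Goalkeeper", "Center Back", "Goalkeeper"],
    "height": [177.0, 179.0, 182.0],
    "games": [1, 1, 2],
})

# One dtype per column; text columns are not numeric.
print(df.dtypes)

# Columns a model could consume directly.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# One way to map a text column to integers: category codes
# (categories are sorted, so "Center Back" -> 0 and "Goalkeeper" -> 1).
df["position_code"] = df["position"].astype("category").cat.codes
print(df[["position", "position_code"]])
```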

1.4 Viewing and extracting the column names

['playerShort', 'player', 'club', 'leagueCountry', 'birthday', 'height', 'weight',
 'position', 'games', 'victories', 'ties', 'defeats', 'goals', 'yellowCards',
 'yellowReds', 'redCards', 'photoID', 'rater1', 'rater2', 'refNum', 'refCountry',
 'Alpha_3', 'meanIAT', 'nIAT', 'seIAT', 'meanExp', 'nExp', 'seExp']
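A list like the one above can be obtained from `df.columns`. A minimal sketch on a made-up three-column frame:

```python
import pandas as pd

# Hypothetical frame with three of the real column names.
df = pd.DataFrame({"playerShort": ["player-a"], "height": [177.0], "games": [1]})

# df.columns is an Index; tolist() turns it into a plain Python list.
all_columns = df.columns.tolist()
print(all_columns)
```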

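A question to consider: if a player appears in many rows, that player is effectively given more weight in any statistic computed over all rows. groupby can be used to fix this: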

　　181.93593798236887

　　181.74372848007872
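The two values above appear to be mean heights computed in two ways: over all rows, and over one value per player. The sketch below reproduces the effect on made-up data, where one tall player appears three times and pulls the row-level mean upward.

```python
import pandas as pd

# Hypothetical data: player-a appears three times, player-b once.
df = pd.DataFrame({
    "playerShort": ["player-a", "player-a", "player-a", "player-b"],
    "height": [190.0, 190.0, 190.0, 170.0],
})

# Mean over all rows: player-a is counted three times.
raw_mean = df["height"].mean()

# Mean over per-player means: each player is counted once.
per_player_mean = df.groupby("playerShort")["height"].mean().mean()

print(raw_mean, per_player_mean)
```

The gap between the two numbers depends on how unevenly players are represented, which is why the difference in the real data is small but not zero.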

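Data usually has many features and high dimensionality, and different questions call for different statistics, so it helps to split the data into several smaller datasets and analyze each one separately. For example: players only, referees only, player-referee pairs, countries only…

2.1.1 Splitting the data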
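A minimal sketch of the split, assuming the player table is keyed by `playerShort` and carries only attributes that describe the player. The choice of columns and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows: one row per player-referee pair, so players repeat.
df = pd.DataFrame({
    "playerShort": ["player-a", "player-a", "player-b"],
    "birthday": ["01.01.1990", "01.01.1990", "02.02.1988"],
    "height": [177.0, 177.0, 182.0],
    "weight": [72.0, 72.0, 80.0],
    "position": ["Goalkeeper", "Goalkeeper", "Center Back"],
    "refNum": [1, 2, 1],
    "redCards": [0, 1, 0],
})

player_index = "playerShort"
player_cols = ["birthday", "height", "weight", "position"]

# Keep only the key and the player-level columns.
player_rows = df[[player_index] + player_cols]

# Still one row per pair: 3 rows for 2 distinct players, so duplicates remain.
print(player_rows.shape, player_rows[player_index].nunique())
```

The leftover duplicates are what the next step checks for and removes.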

2.1.2 Checking for and removing duplicates

　　

　　

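Here a single function does both the check and the deduplication. The main point is to see whether any key value is duplicated.

Calling the function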
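The function itself is missing from these notes. Below is one way such a check-and-deduplicate helper could look. The name `get_subgroup`, its parameters, and the warning text are assumptions, not the original code.

```python
import pandas as pd


def get_subgroup(dataframe, g_index, g_columns):
    """Return one row per g_index, warning if a column varies within a key."""
    grouped = dataframe.groupby(g_index)

    # Number of distinct values each column takes within each key.
    n_distinct = grouped[g_columns].nunique()
    inconsistent = n_distinct[(n_distinct > 1).any(axis=1)]
    if not inconsistent.empty:
        print(f"Warning: {len(inconsistent)} key(s) have conflicting values.")

    # Collapse to one row per key (first non-null value of each column).
    return grouped[g_columns].first()


# Hypothetical data: player-a appears twice with identical attributes.
df = pd.DataFrame({
    "playerShort": ["player-a", "player-a", "player-b"],
    "birthday": ["01.01.1990", "01.01.1990", "02.02.1988"],
    "height": [177.0, 177.0, 182.0],
})

players = get_subgroup(df, "playerShort", ["birthday", "height"])
print(players)
```

Taking the first value per key is safe only when the check finds no conflicts; if the warning fires, the conflicting keys should be inspected before trusting the result.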

　　

Once the data is clean, save it. A save function is added here:

Test-passed: we recover the equivalent subgroup dataframe. Saved successfully.
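The save function is also missing, so this is a sketch of what it might do, judging from the message it prints: write the subgroup to disk, read it back, and confirm the round trip. The name `save_subgroup`, the `directory` parameter, and the file naming are assumptions.

```python
import os
import tempfile

import pandas as pd


def save_subgroup(dataframe, g_index, subgroup_name, directory, prefix="raw_"):
    """Write the subgroup as gzipped CSV, reload it, and compare with the input."""
    path = os.path.join(directory, f"{prefix}{subgroup_name}.csv.gz")
    dataframe.to_csv(path, compression="gzip", encoding="utf-8")

    # Reload with the same index column so the two frames are comparable.
    reloaded = pd.read_csv(path, compression="gzip", index_col=g_index, encoding="utf-8")
    recovered = dataframe.equals(reloaded)
    if recovered:
        print("Test-passed: we recover the equivalent subgroup dataframe.")
    else:
        print("Warning: reloaded data differs from the input, double-check.")
    return recovered


# Hypothetical player table, indexed by playerShort.
players = pd.DataFrame(
    {"height": [177.0, 182.0], "weight": [72.0, 80.0]},
    index=pd.Index(["player-a", "player-b"], name="playerShort"),
)

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_ok = save_subgroup(players, "playerShort", "players", tmp_dir)
```

The round-trip check matters because CSV does not store dtypes, so a column can come back with a different type than it was written with.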

　　

Following the same sequence of steps, other subsets of the data can be split out as well:

club               leagueCountry
FC Nürnberg        Germany
FSV Mainz 05       Germany
1899 Hoffenheim    Germany
AC Ajaccio         France
AFC Bournemouth    England

England    48
Spain      27
France     22
Germany    21
Name: leagueCountry, dtype: int64

　　Test-passed: we recover the equivalent subgroup dataframe.
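As a sketch of how the club table and the per-country counts above could be produced, the example below groups by `club` and then counts clubs per `leagueCountry`. The club names and rows are made up.

```python
import pandas as pd

# Hypothetical rows: clubs repeat because there is one row per player-referee pair.
df = pd.DataFrame({
    "playerShort": ["player-a", "player-b", "player-c", "player-d"],
    "club": ["Club X", "Club X", "Club Y", "Club Z"],
    "leagueCountry": ["Spain", "Spain", "France", "Spain"],
})

# One row per club, keyed by club name.
clubs = df.groupby("club")[["leagueCountry"]].first()
print(clubs)

# Number of clubs per league country, counted on the deduplicated table.
country_counts = clubs["leagueCountry"].value_counts()
print(country_counts)
```

Counting on the deduplicated table rather than on `df` avoids weighting each club by how many rows it has.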


Author: admin. Link to this article: https://123wssy.com/post/888.html
