pandas.DataFrame中提取特定类型dtype的列

作者：饺子大人时间：2021-06-13 06:04:25　

pandas.DataFrame为每一列保存一个数据类型dtype。

要仅提取（选择）特定数据类型为dtype的列，请使用pandas.DataFrame的select_dtypes（）方法。

以带有各种数据类型的列的pandas.DataFrame为例。

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1, 3],
'b': [0.4, 1.1, 0.1, 0.8],
'c': ['X', 'Y', 'X', 'Z'],
'd': [[0, 0], [0, 1], [1, 0], [1, 1]],
'e': [True, True, False, True]})

df['f'] = pd.to_datetime(['2018-01-01', '2018-03-15', '2018-02-20', '2018-03-15'])

print(df)
# a b c d e f
# 0 1 0.4 X [0, 0] True 2018-01-01
# 1 2 1.1 Y [0, 1] True 2018-03-15
# 2 1 0.1 X [1, 0] False 2018-02-20
# 3 3 0.8 Z [1, 1] True 2018-03-15

print(df.dtypes)
# a int64
# b float64
# c object
# d object
# e bool
# f datetime64[ns]
# dtype: object

将描述以下内容。

select_dtypes（）的基本用法

指定要提取的类型：参数include
指定要排除的类型：参数exclude

select_dtypes（）的基本用法

指定要提取的类型：参数include

在参数include中指定要提取的数据类型dtype。

print(df.select_dtypes(include=int))
# a
# 0 1
# 1 2
# 2 1
# 3 3

可以按原样指定作为Python的内置类型提供的那些变量，例如int和float。您可以将“ int”指定为字符串，也可以指定“ int64”（包括确切位数）。（标准位数取决于环境）

print(df.select_dtypes(include='int'))
# a
# 0 1
# 1 2
# 2 1
# 3 3

print(df.select_dtypes(include='int64'))
# a
# 0 1
# 1 2
# 2 1
# 3 3

当然，当最多包括位数时，除非位数匹配，否则不会选择它。

print(df.select_dtypes(include='int32'))
# Empty DataFrame
# Columns: []
# Index: [0, 1, 2, 3]

列表中可以指定多种数据类型dtype。日期和时间datetime64 [ns]可以由’datetime’指定。

print(df.select_dtypes(include=[int, float, 'datetime']))
# a b f
# 0 1 0.4 2018-01-01
# 1 2 1.1 2018-03-15
# 2 1 0.1 2018-02-20
# 3 3 0.8 2018-03-15

可以将数字类型（例如int和float）与特殊值“ number”一起指定。

print(df.select_dtypes(include='number'))
# a b
# 0 1 0.4
# 1 2 1.1
# 2 1 0.1
# 3 3 0.8

元素为字符串str类型的列的数据类型dtype是object，但是object列还包含除str外的Python标准内置类型。实际上，数量并不多，但是，如示例中所示，如果有一列的元素为列表类型，请注意，该列也是由include = object提取的。

print(df.select_dtypes(include=object))
# c d
# 0 X [0, 0]
# 1 Y [0, 1]
# 2 X [1, 0]
# 3 Z [1, 1]

print(type(df.at[0, 'c']))
# <class 'str'>

print(type(df.at[0, 'd']))
# <class 'list'>

但是，除非对其进行有意处理，否则字符串str类型以外的对象都不会（可能）成为pandas.DataFrame的元素，因此不必担心太多。

指定要排除的类型：参数exclude

在参数exclude中指定要排除的数据类型dtype。您还可以在列表中指定多个数据类型dtype。

print(df.select_dtypes(exclude='number'))
# c d e f
# 0 X [0, 0] True 2018-01-01
# 1 Y [0, 1] True 2018-03-15
# 2 X [1, 0] False 2018-02-20
# 3 Z [1, 1] True 2018-03-15

print(df.select_dtypes(exclude=[bool, 'datetime']))
# a b c d
# 0 1 0.4 X [0, 0]
# 1 2 1.1 Y [0, 1]
# 2 1 0.1 X [1, 0]
# 3 3 0.8 Z [1, 1]

可以同时指定包含和排除，但是如果指定相同的类型，则会发生错误。

print(df.select_dtypes(include='number', exclude=int))
# b
# 0 0.4
# 1 1.1
# 2 0.1
# 3 0.8

# print(df.select_dtypes(include=[int, bool], exclude=int))
# ValueError: include and exclude overlap on frozenset({<class 'numpy.int64'>})

来源：https://blog.csdn.net/qq_18351157/article/details/109745683

标签：pandas,DataFrame,特定类型,列

投稿

pandas.DataFrame中提取特定类型dtype的列

select_dtypes（）的基本用法

指定要提取的类型：参数include

指定要排除的类型：参数exclude

猜你喜欢

一些建站常用简单html代码

php SQL防注入代码集合

Django添加KindEditor富文本编辑器的使用

win10下python2和python3共存问题解决方法

Mootools 1.2教程(11)——Fx.Morph、Fx选项和Fx事件

在Python的Tornado框架中实现简单的在线代理的教程

python实现的分析并统计nginx日志数据功能示例

ASP模拟POST提交数据的方法

em和strong的区别

Python语言基础之函数语法

使用字符串建立查询能加快服务器的解析速度吗？

HTML5硝烟弥漫

Python进阶之协程详解

PHP用mysql数据库存储session的代码

css实现图片倒影效果

我喜欢你抖音表白程序python版

网页广告 Banner 设计图文手册

python爬虫的一个常见简单js反爬详解

HTML5设计原则

Python使用Asyncio进行web编程方法详解

pandas.DataFrame中提取特定类型dtype的列

select_dtypes（）的基本用法

指定要提取的类型：参数include

指定要排除的类型：参数exclude

猜你喜欢

一些建站常用简单html代码

php SQL防注入代码集合

Django添加KindEditor富文本编辑器的使用

win10下python2和python3共存问题解决方法

Mootools 1.2教程(11)——Fx.Morph、Fx选项和Fx事件

在Python的Tornado框架中实现简单的在线代理的教程

python实现的分析并统计nginx日志数据功能示例

ASP模拟POST提交数据的方法

em和strong的区别

Python语言基础之函数语法

使用字符串建立查询能加快服务器的解析速度吗？

HTML5硝烟弥漫

Python进阶之协程详解

PHP用mysql数据库存储session的代码

css实现图片倒影效果

我喜欢你 抖音表白程序python版

网页广告 Banner 设计图文手册

python爬虫的一个常见简单js反爬详解

HTML5设计原则

Python使用Asyncio进行web编程方法详解

我喜欢你抖音表白程序python版