Usage of Pandas count() and value_counts() and the Differences Between Them
Author: Elvirangel  Date: 2021-09-25 08:28:20
Usage of Pandas count() and value_counts()
count()
value_counts() applied to the specified column to be counted
The result gains that column:
Comparison:
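As a minimal sketch of the contrast the captions above describe, the following assumes a small made-up DataFrame (the column names `team` and `score` and all values are illustrative, not taken from the original article). It calls count() on the whole DataFrame and value_counts() on one named column.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only; not from the original article.
df = pd.DataFrame({
    "team": ["a", "b", "a", "c", "a", "b"],
    "score": [1.0, np.nan, 3.0, 4.0, np.nan, 6.0],
})

# count() returns the number of non-null entries in each column.
non_null_per_column = df.count()
print(non_null_per_column)

# value_counts() on a single column returns how often each distinct value occurs.
team_frequencies = df["team"].value_counts()
print(team_frequencies)
```

count() answers "how many usable values are there?", while value_counts() answers "how are those values distributed?".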
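Pandas: count() vs. value_counts()
1. Series.value_counts(self, normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Returns a Series containing every distinct value and the number of times it occurs. The result is sorted in descending order, so the most frequent value appears in the first row.
The parameters are as follows: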
| Parameter | Type and default | Description |
|---|---|---|
| normalize | boolean, default False | If True, the returned object contains the relative frequencies of the unique values. |
| sort | boolean, default True | Sort by frequencies. |
| ascending | boolean, default False | Sort in ascending order. |
| bins | integer, optional | Rather than counting values, group them into half-open bins; a convenience for pd.cut, only works with numeric data. |
| dropna | boolean, default True | Don't include counts of NaN. |

Returns: Series
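The parameters listed above can be tried out one at a time. The following is a minimal sketch on made-up data; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, 1, 2, 3, 4, np.nan])

# ascending=True puts the least frequent values first.
least_first = s.value_counts(ascending=True)

# dropna=False adds a row for NaN instead of leaving it out.
with_nan = s.value_counts(dropna=False)

# bins=3 groups numeric values into 3 intervals and counts each interval.
numeric = pd.Series([3, 1, 2, 3, 4])
binned = numeric.value_counts(bins=3)

print(least_first)
print(with_nan)
print(binned)
```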
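An example:
import numpy as np
import pandas as pd
index = pd.Index([3, 1, 2, 3, 4, np.nan])
# NaN is left out of the result because dropna defaults to True
index.value_counts()
"""
Output (the order of tied values may differ between pandas versions):
3.0 2
4.0 1
2.0 1
1.0 1
dtype: int64
"""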
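If normalize is True, the result contains relative frequencies, which sum to 1:
import numpy as np
import pandas as pd
s = pd.Series([3, 1, 2, 3, 4, np.nan])
# each count is divided by the number of non-null values (5 here)
s.value_counts(normalize=True)
"""
Output:
3.0 0.4
4.0 0.2
2.0 0.2
1.0 0.2
dtype: float64
"""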
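2. Series.count(self, level=None)
Returns the number of non-null values. For data loaded from a CSV file, it can be used to count the rows of a column, for example:
import pandas as pd
file = pd.read_csv('test.csv')
print(file['A'].count())
# Prints the number of non-null values in column A; this equals the row count only if A has no missing values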
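The gap between count() and the true row count shows up as soon as a column has a missing value. This sketch uses an in-memory string as a stand-in for 'test.csv'; the CSV content is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'test.csv'; the second data row has an empty A field.
csv_text = "A,B\n1,x\n,y\n3,z\n"
file = pd.read_csv(io.StringIO(csv_text))

non_null_in_a = file["A"].count()  # skips the missing value
total_rows = len(file)             # counts every row, missing or not

print(non_null_in_a, total_rows)
```

When the goal is the number of rows regardless of missing data, len(file) is the safer choice.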
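The parameters are as follows:

| Parameter | Type and default | Description |
|---|---|---|
| level | int or level name, default None | If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a smaller Series. |

Returns: int, or Series if level is specified. The number of non-null values in the Series.

Note: the level argument was deprecated in pandas 1.3 and removed in pandas 2.0; s.groupby(level=...).count() gives the same per-level counts.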
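A minimal sketch of counting along one level of a MultiIndex. It uses groupby(level=...).count() rather than count(level=...), on the assumption that the reader may be on a pandas release where the level argument of count() is no longer available. The index names and values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index: (group, item).
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)],
    names=["group", "item"],
)
s = pd.Series([1.0, np.nan, 3.0, 4.0], index=idx)

# Non-null count for each value of the "group" level.
per_group = s.groupby(level="group").count()
print(per_group)
```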
An example:
import numpy as np
import pandas as pd
s = pd.Series([0.0, 1.0, np.nan])
s.count()
# The result is 2, because NaN is not counted
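Putting the two methods side by side on one made-up Series may make the relationship clearer: count() gives a single number, value_counts() gives one number per distinct value.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, 1, 2, 3, 4, np.nan])

total_non_null = s.count()    # one number: how many non-null values
per_value = s.value_counts()  # a Series: how many times each distinct value occurs

# With the default dropna=True, the per-value counts add up to count().
print(total_non_null, per_value.sum())
```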
That is the difference between the two methods and what each is used for.
Source: https://blog.csdn.net/Elvirangel/article/details/104556394