Python Data Processing: Basic Usage of the pd.Series() Function

Author: 流年里不舍的执着 Date: 2022-09-29 08:50:21

1.Introduction to Series

pandas has two main data structures: 1.Series 2.DataFrame

A Series is a one-dimensional labeled array built on NumPy's ndarray structure.

Series([data, index, dtype, name, copy, …])    
# One-dimensional ndarray with axis labels (including time series).

2.Creating a Series

import pandas as pd
import numpy as np

1.pd.Series([list],index=[list])

The data argument is a list. index is optional; if it is omitted, the labels default to integers starting from 0.

obj = pd.Series([4, 7, -5, 3, 7, np.nan])
obj

Output:

0    4.0
1    7.0
2   -5.0
3    3.0
4    7.0
5    NaN
dtype: float64
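The example above omits index, and the np.nan entry is what forces the dtype to float64. As a minimal sketch of the optional index argument, the following passes explicit labels; the labels and the variable name s_labeled are made up for illustration.

```python
import pandas as pd

# Hypothetical labels, chosen only for illustration.
labels = ['a', 'b', 'c', 'd']

# Same constructor as above, but with an explicit index.
s_labeled = pd.Series([4, 7, -5, 3], index=labels)

print(s_labeled.index.tolist())  # the labels replace the default 0..n-1
print(s_labeled['c'])            # look up a value by its label
print(s_labeled.dtype)           # integer dtype, because there is no NaN
```

The length of index must match the length of the data list, otherwise pandas raises a ValueError.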

2.pd.Series(np.arange())

arr = np.arange(6)
s = pd.Series(arr)
s

Output:

0    0
1    1
2    2
3    3
4    4
5    5
dtype: int32
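The integer dtype printed above is platform-dependent: np.arange may default to int32 (as on Windows with older NumPy releases) or to int64. A small sketch, assuming a fixed dtype is wanted regardless of platform, is to request it explicitly; the names s64 and s_float are illustrative.

```python
import numpy as np
import pandas as pd

# Ask NumPy for a specific integer width instead of the platform default.
s64 = pd.Series(np.arange(6, dtype=np.int64))
print(s64.dtype)

# Alternatively, let pd.Series convert the data through its dtype argument.
s_float = pd.Series(np.arange(6), dtype='float64')
print(s_float.dtype)
```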

3.pd.Series({dict})

d = {'a':10,'b':20,'c':30,'d':40,'e':50}
s = pd.Series(d)
s

Output:

a    10
b    20
c    30
d    40
e    50
dtype: int64
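When a dict is passed, its keys become the index. A short sketch of combining a dict with the index argument follows: index then selects and orders the keys, and a label that is not in the dict gets NaN. The label 'z' and the name s_sel are assumptions for illustration.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}

# 'z' is a hypothetical label that does not exist in d.
s_sel = pd.Series(d, index=['c', 'a', 'z'])

print(s_sel)
# 'c' and 'a' keep their values from the dict; 'z' becomes NaN,
# which also turns the dtype into float64.
```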

A Series can also be created from a single row or a single column of a DataFrame.
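A minimal sketch of taking a row or a column out of a DataFrame; the frame df, its column names and its row labels are invented for this example.

```python
import pandas as pd

# Hypothetical DataFrame used only to illustrate the idea.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4.0, 5.0, 6.0]},
                  index=['r1', 'r2', 'r3'])

col = df['x']       # one column: a Series named 'x', indexed by row label
row = df.loc['r2']  # one row: a Series named 'r2', indexed by column name

print(type(col).__name__, col.name)
print(type(row).__name__, row.name, row.index.tolist())
```

A row that spans columns of different dtypes is upcast to a common dtype, so row here holds floats.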

3.Basic Series attributes

  • Series.values:Return Series as ndarray or ndarray-like depending on the dtype

obj.values
# array([ 4.,  7., -5.,  3.,  7., nan])

  • Series.index:The index (axis labels) of the Series.

obj.index
# RangeIndex(start=0, stop=6, step=1)

  • Series.name:Return name of the Series.
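Series.name has no example above, so here is a small sketch. The name 'score' and the variable names are illustrative assumptions.

```python
import pandas as pd

# Give the Series a name at construction time.
s_named = pd.Series([4, 7, -5], name='score')
print(s_named.name)

# The name becomes the column label when the Series is turned into a DataFrame.
print(s_named.to_frame().columns.tolist())

# rename() with a scalar returns a new Series with a different name.
renamed = s_named.rename('points')
print(renamed.name, s_named.name)
```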

4.Indexing

  • Series.loc:Access a group of rows and columns by label(s) or a boolean array.

  • Series.iloc:Purely integer-location based indexing for selection by position.
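The difference between the two accessors is easiest to see side by side. The sketch below uses a made-up Series s_idx with string labels: loc selects by label, iloc by integer position.

```python
import pandas as pd

# Hypothetical data for illustration.
s_idx = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

by_label = s_idx.loc['b']         # label-based lookup
by_position = s_idx.iloc[1]       # position-based lookup (0-indexed)

label_slice = s_idx.loc['b':'c']  # label slices include BOTH endpoints
pos_slice = s_idx.iloc[1:2]       # position slices exclude the stop, like lists

masked = s_idx.loc[s_idx > 25]    # loc also accepts a boolean array

print(by_label, by_position)
print(label_slice.tolist(), pos_slice.tolist(), masked.tolist())
```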

5.Computation and descriptive statistics

  • Series.value_counts:Return a Series containing counts of unique values.

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan],index = index)
obj.value_counts()

Output:

 7.0    2
 3.0    1
-5.0    1
 4.0    1
dtype: int64
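The counts above cover only five of the six values, because value_counts leaves NaN out by default. A brief sketch of the dropna and normalize parameters, using the same values without the name labels; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = pd.Series([4, 7, -5, 3, 7, np.nan])

counts = values.value_counts()                       # NaN is excluded
counts_with_nan = values.value_counts(dropna=False)  # NaN gets its own row
shares = values.value_counts(normalize=True)         # relative frequencies

print(counts.sum(), counts_with_nan.sum())
print(shares.loc[7.0])
```

With normalize=True the frequencies are computed over the non-NaN values, so they sum to 1.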

6.Sorting

  • Series.sort_values

Series.sort_values(self, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

Parameters:

| Parameter | Description |
| --- | --- |
| axis | {0 or 'index'}, default 0. Axis to direct sorting. The value 'index' is accepted for compatibility with DataFrame.sort_values. |
| ascending | bool, default True. If True, sort values in ascending order, otherwise descending. |
| inplace | bool, default False. If True, perform operation in-place. |
| kind | {'quicksort', 'mergesort' or 'heapsort'}, default 'quicksort'. Choice of sorting algorithm. See also numpy.sort() for more information. 'mergesort' is the only stable algorithm. |
| na_position | {'first' or 'last'}, default 'last'. Argument 'first' puts NaNs at the beginning, 'last' puts NaNs at the end. |

Returns:

Series:Series ordered by values.

obj.sort_values()

Output:

Jeff    -5.0
Ryan     3.0
Bob      4.0
Steve    7.0
Jeff     7.0
Ryan     NaN
dtype: float64
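The call above uses the defaults only. The following sketch exercises the ascending, na_position and kind parameters from the table on the same data; it rebuilds the Series under the name obj2 so that the block runs on its own.

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([4, 7, -5, 3, 7, np.nan],
                 index=['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan'])

desc = obj2.sort_values(ascending=False)           # largest first, NaN still last
nan_first = obj2.sort_values(na_position='first')  # NaN moved to the front
stable = obj2.sort_values(kind='mergesort')        # ties keep their original order

print(desc.tolist())
print(nan_first.tolist())
print(stable.index.tolist())

# inplace defaults to False, so obj2 itself is unchanged.
print(obj2.tolist())
```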

  • Series.rank

Series.rank(self, axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)

Parameters:

| Parameter | Description |
| --- | --- |
| axis | {0 or 'index', 1 or 'columns'}, default 0. Index to direct ranking. |
| method | {'average', 'min', 'max', 'first', 'dense'}, default 'average'. How to rank the group of records that have the same value (i.e. ties): average: average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in order they appear in the array; dense: like 'min', but rank always increases by 1 between groups. |
| numeric_only | bool, optional. For DataFrame objects, rank only numeric columns if set to True. |
| na_option | {'keep', 'top', 'bottom'}, default 'keep'. How to rank NaN values: keep: assign NaN rank to NaN values; top: assign smallest rank to NaN values if ascending; bottom: assign highest rank to NaN values if ascending. |
| ascending | bool, default True. Whether or not the elements should be ranked in ascending order. |
| pct | bool, default False. Whether or not to display the returned rankings in percentile form. |

Returns:

same type as caller :Return a Series or DataFrame with data ranks as values.

# obj.rank()              # default method='average': the smallest value gets rank 1, NaN stays NaN
obj.rank(method='dense')
# obj.rank(method='min')
# obj.rank(method='max')
# obj.rank(method='first')

Output:

Bob      3.0
Steve    4.0
Jeff     1.0
Ryan     2.0
Jeff     4.0
Ryan     NaN
dtype: float64
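The output above is for method='dense'. To see how the methods differ on the tied value 7.0, the sketch below ranks the same data with each one. The Series is rebuilt as scores so that the block is self-contained, and collecting the results in a plain dict is just a convenience for printing.

```python
import numpy as np
import pandas as pd

scores = pd.Series([4, 7, -5, 3, 7, np.nan],
                   index=['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan'])

methods = ['average', 'min', 'max', 'first', 'dense']

# Map each method name to the list of ranks it produces.
ranks = {m: scores.rank(method=m).tolist() for m in methods}

for m in methods:
    print(f"{m:>8}: {ranks[m]}")
```

Only the two 7.0 entries change between methods: 'average' gives both 4.5, 'min' gives both 4, 'max' gives both 5, 'first' gives 4 and 5 in order of appearance, and 'dense' gives both 4. The NaN entry stays NaN in every case because na_option defaults to 'keep'.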

Source: https://blog.csdn.net/weixin_43868107/article/details/102631717

Tags: python, pd.series, pandas