Introduction to the mean() function in Python's NumPy library

Author: 饕餮争锋  Date: 2021-12-19 16:22:37

1. Definition of mean():

numpy.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
Compute the arithmetic mean along the specified axis.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.
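The default behaviour described above can be sketched as follows. This is a minimal illustration, assuming NumPy is installed; the array `a` is made up for the example and does not come from the original article.

```python
import numpy as np

# Illustrative integer input (not from the original article).
a = np.array([[1, 2], [3, 4]])

# With no axis given, the mean is taken over the flattened array.
overall = np.mean(a)

# Integer input is accumulated and returned as float64.
print(overall, overall.dtype)  # 2.5 float64
```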

Parameters:


a : array_like

Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.

axis : None or int or tuple of ints, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before.

dtype : data-type, optional

Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

out : ndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.

keepdims : bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the mean method of sub-classes of ndarray; any non-default value will be. If the sub-class' method does not implement keepdims, any exceptions will be raised.

Returns:

m : ndarray, see dtype parameter above

If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.
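The parameters listed above can be exercised together in a short sketch. The 3-D array `a` and the names `m_tuple`, `m_f64`, `m_keep`, `centred`, `buf`, and `ret` are illustrative assumptions, not part of the original article.

```python
import numpy as np

# Illustrative float32 input with shape (2, 3, 4).
a = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

# axis as a tuple: reduce over axes 0 and 2 in one call, leaving axis 1.
m_tuple = np.mean(a, axis=(0, 2))            # shape (3,)

# dtype: accumulate in float64 even though the input is float32.
m_f64 = np.mean(a, dtype=np.float64)

# keepdims: reduced axes stay as size-one dimensions, so the result
# broadcasts against the input (here, to centre the data along axis 1).
m_keep = np.mean(a, axis=1, keepdims=True)   # shape (2, 1, 4)
centred = a - m_keep

# out: write the result into a preallocated array of the expected shape.
buf = np.empty(3, dtype=np.float64)
ret = np.mean(a, axis=(0, 2), out=buf)       # ret refers to buf
```

Raising the accumulator precision with dtype mostly matters for large float32 or float16 inputs, where a lower-precision mean can be noticeably inaccurate.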

2. What mean() does: computing the mean

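The parameter used most often is axis. Taking an m × n matrix as an example:

axis not set: the mean of all m*n elements is computed and a single scalar is returned.

axis = 0: the rows are collapsed and the mean of each column is computed, giving a 1 × n matrix (for a plain ndarray, a 1-D array of length n).

axis = 1: the columns are collapsed and the mean of each row is computed, giving an m × 1 matrix (for a plain ndarray, a 1-D array of length m).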
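The three cases can be checked by looking at result shapes on a plain ndarray. This is only a sketch; the 4 × 3 array `x` is an arbitrary illustrative input.

```python
import numpy as np

m, n = 4, 3
x = np.arange(m * n).reshape(m, n)   # illustrative 4 x 3 ndarray

print(np.mean(x).shape)                         # ()     : a scalar
print(np.mean(x, axis=0).shape)                 # (3,)   : one mean per column
print(np.mean(x, axis=1).shape)                 # (4,)   : one mean per row

# keepdims=True keeps the collapsed axis, giving the 1 x n and m x 1
# shapes that np.matrix produces on its own.
print(np.mean(x, axis=0, keepdims=True).shape)  # (1, 3)
print(np.mean(x, axis=1, keepdims=True).shape)  # (4, 1)
```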

Example:

>>> import numpy as np

>>> num1 = np.array([[1,2,3],[2,3,4],[3,4,5],[4,5,6]])
>>> now2 = np.mat(num1)
>>> now2
matrix([[1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [4, 5, 6]])

>>> np.mean(now2) # mean of all elements
3.5

>>> np.mean(now2,0) # collapse the rows: mean of each column
matrix([[ 2.5, 3.5, 4.5]])

>>> np.mean(now2,1) # collapse the columns: mean of each row
matrix([[ 2.],
        [ 3.],
        [ 4.],
        [ 5.]])

Supplement: np.nanmax versus np.max (a pitfall)

The difference between NumPy's np.nanmax and np.array([1,2,3,np.nan]).max() (a pitfall)

See the official NumPy documentation for numpy.nanmax.

How it works

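When computing the maximum of a DataFrame column, the first tool to reach for is the max() method of the Series object; for the data below it returns 4.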

import numpy as np
import pandas as pd
s1 = pd.Series([1,2,3,4,np.nan])
s1_max = s1.max()

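However, because the author's data set was very large, with many columns, NumPy was used to compute the maximum in order to speed things up. As the code below shows, the result is then nan rather than 4: computed this way, the NaN values take part in the reduction and the final result is nan.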

s1 = pd.Series([1,2,3,4,np.nan])
s1_max = s1.values.max()
print(s1_max)  # nan

Reading the NumPy documentation shows that there is a function np.nanmax, which excludes np.nan when computing the maximum and returns the expected result.
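A minimal sketch of the difference, using NumPy only. Here `values` is an illustrative stand-in for `s1.values` from the snippet above.

```python
import numpy as np

# Stand-in for s1.values: the same numbers as a float ndarray.
values = np.array([1.0, 2.0, 3.0, 4.0, np.nan])

plain_max = values.max()        # NaN propagates through ndarray.max()
nan_aware = np.nanmax(values)   # NaN entries are ignored

print(plain_max, nan_aware)     # nan 4.0
```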

This is not limited to max: min, std, and mean behave the same way. When a column contains np.nan, s1.values.min(), s1.values.std(), and s1.values.mean() all return nan.
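The NaN-aware counterparts of these reductions are np.nanmin, np.nanstd, and np.nanmean. The sketch below again uses an illustrative array `values` in place of `s1.values`.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, np.nan])

# The plain reductions all return nan once a NaN is present.
plain = [values.min(), values.mean(), values.std()]

# The nan-prefixed functions skip NaN entries.
nan_min = np.nanmin(values)    # 1.0
nan_mean = np.nanmean(values)  # 2.5
nan_std = np.nanstd(values)    # population std (ddof=0) of [1, 2, 3, 4]

print(plain, nan_min, nan_mean, nan_std)
```

One thing to watch when switching between the two libraries: Series.std() defaults to ddof=1, while np.nanstd defaults to ddof=0, so the two results differ unless ddof is set explicitly.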

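Speed comparison

From fastest to slowest, according to the author:

s1 = pd.Series([1,2,3,4,5,np.nan])
# Ranking, fastest first (">" here means "is faster than"; it is not a comparison to execute):
#   np.nanmax(s1.values)  >  np.nanmax(s1)  >  s1.max()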
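The ranking can be checked on your own machine with timeit. The following is only a rough sketch: the Series size, the NaN spacing, and the repeat count are arbitrary assumptions, and the actual ordering may vary with the NumPy and pandas versions and the data size.

```python
import timeit

import numpy as np
import pandas as pd

# Larger illustrative Series; the size is an arbitrary choice.
rng = np.random.default_rng(0)
s1 = pd.Series(rng.random(100_000))
s1.iloc[::100] = np.nan   # sprinkle in some NaN values

candidates = {
    "np.nanmax(s1.values)": lambda: np.nanmax(s1.values),
    "np.nanmax(s1)": lambda: np.nanmax(s1),
    "s1.max()": lambda: s1.max(),
}

# All three approaches should agree on the value.
results = {name: float(fn()) for name, fn in candidates.items()}

for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=20)
    print(f"{name:<22} {seconds:.4f} s for 20 calls")
```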

Source: https://blog.csdn.net/taotiezhengfeng/article/details/72397282

Tags: python, numpy, mean
