How to check for NaN values

Question

float('nan') represents NaN (not a number). But how do I check for it?

Answer 1

Use math.isnan:

>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True

Answer 2

The usual way to test for NaN is to check whether it is equal to itself:

def isNaN(num):
    return num != num
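
One reason the self-comparison trick is needed: under IEEE 754, NaN compares unequal to every value, including another NaN, so an equality test against float('nan') can never succeed. A minimal sketch that reuses the isNaN helper from this answer (the variable name x is only illustrative):

```python
def isNaN(num):
    # NaN is the only float that is not equal to itself
    return num != num

x = float('nan')

# An equality test against a fresh NaN is always False, even when x is NaN.
print(x == float('nan'))   # False
print(isNaN(x))            # True
print(isNaN(1.0))          # False
```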

Answer 3

numpy.isnan(number) tells you whether it is NaN or not.
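
numpy.isnan also works elementwise on arrays, which helps when many values need checking at once. A small sketch, assuming NumPy is installed; the sample array is made up for illustration:

```python
import numpy as np

values = np.array([1.0, np.nan, 3.5, np.nan])

mask = np.isnan(values)   # boolean array, True where the element is NaN
print(mask)
print(mask.any())         # whether at least one NaN is present
print(int(mask.sum()))    # how many NaNs there are
```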

Answer 4

Here are three ways to test whether a variable is NaN.

import pandas as pd
import numpy as np
import math

# For single variable all three libraries return single boolean
x1 = float("nan")

print(f"It's pd.isna: {pd.isna(x1)}")
print(f"It's np.isnan: {np.isnan(x1)}")
print(f"It's math.isnan: {math.isnan(x1)}")

Output:

It's pd.isna: True
It's np.isnan: True
It's math.isnan: True

Answer 5

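Editor's note: The timings below are flawed; for example, they do not account for name lookup time. See the comments.

It seems that checking whether a value is not equal to itself (x != x) is the fastest.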

import pandas as pd 
import numpy as np 
import math 

x = float('nan')

%timeit x != x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
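
Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. Binding math.isnan to a local name in the setup string is one way to separate the cost of the attribute lookup from the call itself. The loop count is arbitrary, and absolute numbers will vary by machine and Python version.

```python
import timeit

setup = "import math; x = float('nan'); isnan = math.isnan"
n = 100_000  # arbitrary loop count, kept small so the script finishes quickly

timings = {
    "x != x": timeit.timeit("x != x", setup=setup, number=n),
    "math.isnan(x)": timeit.timeit("math.isnan(x)", setup=setup, number=n),
    # same call, but without the attribute lookup on the math module
    "isnan(x)": timeit.timeit("isnan(x)", setup=setup, number=n),
}

for label, seconds in timings.items():
    print(f"{label:<14} {seconds / n * 1e9:8.1f} ns per loop")
```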

Answer 6

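Here is an answer that works with:

- NaN implementations that follow the IEEE 754 standard, i.e. Python's NaN: `float('nan')`, `numpy.nan`, ...
- any other objects: strings or anything else (no exception is raised if one is encountered)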

A NaN implemented following the standard is the only value for which the inequality comparison with itself should return True:

def is_nan(x):
    return (x != x)

And some examples:

import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")

Output:

nan      : True
nan      : True
55       : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False
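
Because is_nan relies only on the != operator, it also handles NaN values from other numeric types. The sketch below uses decimal.Decimal from the standard library as an illustration: a quiet Decimal NaN compares unequal to itself, just like a float NaN.

```python
from decimal import Decimal

def is_nan(x):
    return x != x

print(is_nan(Decimal("NaN")))   # True
print(is_nan(Decimal("1.5")))   # False
print(is_nan(None))             # False
```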

Answer 7

I actually just ran into this, but in my case I needed to check for nan, -inf, or inf. I just used

if float('-inf') < float(num) < float('inf'):

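This is true for numbers and false for nan and both infinities, and it raises an exception for things like non-numeric strings or other types (which is probably a good thing). It also does not require importing any library such as math or numpy (numpy is large enough that it can double the size of a compiled application).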
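
The chained comparison can be wrapped in a small helper. This is only a sketch: the name is_finite_number is made up here, and values that float() cannot convert still raise, as described above.

```python
def is_finite_number(num):
    # True for ordinary numbers; False for nan, inf and -inf.
    # Any ordering comparison involving NaN is False, so NaN fails the test.
    return float('-inf') < float(num) < float('inf')

print(is_finite_number(3.14))          # True
print(is_finite_number(float('nan')))  # False
print(is_finite_number(float('inf')))  # False

try:
    is_finite_number("abc")
except ValueError as exc:
    print("raised:", exc)
```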

Answer 8

Use math.isnan(), or compare the number to itself. NaN is always != NaN, whereas any ordinary number compares equal to itself.

Answer 9

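I wrote this post because I had some issues with the function math.isnan(). There is a problem when you run this code:

import math

a = "hello"
math.isnan(a)

It raises a TypeError. My solution is to add another check: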

def is_nan(x):
    return isinstance(x, float) and math.isnan(x)
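
An alternative sketch lets math.isnan decide and treats anything it rejects as not NaN. The helper name is_nan_safe is hypothetical; unlike the isinstance check above, it does not require the value to be exactly a float.

```python
import math

def is_nan_safe(x):
    try:
        return math.isnan(x)
    except (TypeError, OverflowError):
        # TypeError: x is not a real number (e.g. a string or None)
        # OverflowError: x is an int too large to convert to float
        return False

print(is_nan_safe(float('nan')))  # True
print(is_nan_safe("hello"))       # False
print(is_nan_safe(None))          # False
print(is_nan_safe(10**400))       # False
```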

Answer 10

Here is another method if you are stuck on Python < 2.6, don't have numpy, and don't have IEEE 754 support:

def isNaN(x):
    return str(x) == str(1e400*0)

Answer 11

With Python < 2.6 I ended up with

def isNaN(x):
    return str(float(x)).lower() == 'nan'

This works for me with Python 2.5.1 on a Solaris 5.9 box and with Python 2.6.5 on Ubuntu 10.

Answer 12

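A comparison of pd.isna, math.isnan, and np.isnan, and their flexibility in dealing with different types of objects.

The table below shows whether an object of the given type can be checked with each method: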


+------------+-----+---------+------+--------+------+
|   Method   | NaN | numeric | None | string | list |
+------------+-----+---------+------+--------+------+
| pd.isna    | yes | yes     | yes  | yes    | yes  |
| math.isnan | yes | yes     | no   | no     | no   |
| np.isnan   | yes | yes     | no   | no     | yes  | <-- # will error on mixed type list
+------------+-----+---------+------+--------+------+

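pd.isna: the most flexible method for checking different types of missing values.

None of the answers cover the flexibility of pd.isna. While math.isnan and np.isnan return True for NaN values, they cannot check other types of objects such as None or strings. Both functions raise an error on those, so checking a list with mixed types is cumbersome. pd.isna, by contrast, is flexible and returns the correct boolean for each of these types: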

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: missing_values = [3, None, np.nan, pd.NA, pd.NaT, '10']

In [4]: pd.isna(missing_values)
Out[4]: array([False,  True,  True,  True,  True, False])
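
Building on the session above, pd.isna can also be applied element by element to drop missing entries from a mixed-type list. A minimal sketch, assuming pandas and NumPy are installed; the list mirrors the one in the session.

```python
import numpy as np
import pandas as pd

values = [3, None, np.nan, pd.NA, pd.NaT, '10']

# pd.isna returns a single boolean for each scalar, whatever its type
cleaned = [v for v in values if not pd.isna(v)]
print(cleaned)   # [3, '10']
```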

Answer 13

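I am receiving data from a web service that sends NaN as the string 'Nan'. But there could be other sorts of strings in my data as well, so a simple float(value) could throw an exception. I used the following variant of the accepted answer: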

import math

def isnan(value):
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError, OverflowError):
        # value cannot be converted to a float, so it is not NaN
        return False

Requirement:

isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) == True

Answer 14

All the ways to tell whether a variable is NaN or None:

None type

In [1]: import math

In [2]: a = None
In [3]: not a
Out[3]: True

In [4]: len(a or ()) == 0
Out[4]: True

In [5]: a == None
Out[5]: True

In [6]: a is None
Out[6]: True

In [7]: a != a
Out[7]: False

In [9]: math.isnan(a)
Traceback (most recent call last):
  File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
    math.isnan(a)
TypeError: a float is required

In [10]: len(a) == 0
Traceback (most recent call last):
  File "<ipython-input-10-65b72372873e>", line 1, in <module>
    len(a) == 0
TypeError: object of type 'NoneType' has no len()

NaN type

In [11]: b = float('nan')
In [12]: b
Out[12]: nan

In [13]: not b
Out[13]: False

In [14]: b != b
Out[14]: True

In [15]: math.isnan(b)
Out[15]: True
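
The checks above can be combined into one helper that treats both None and NaN as missing. This is a sketch; the name is_missing is made up, and it deliberately returns False for zero and empty strings, which a bare `not a` test would not distinguish from None.

```python
import math

def is_missing(value):
    # None is matched by identity; NaN needs math.isnan, guarded by a type
    # check so that strings and other objects do not raise TypeError.
    return value is None or (isinstance(value, float) and math.isnan(value))

print(is_missing(None))          # True
print(is_missing(float('nan')))  # True
print(is_missing(0))             # False
print(is_missing(""))            # False
```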

Answer 15

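In Python 3.6, calling math.isnan(x) or np.isnan(x) on a string value x raises an error. So I cannot check whether a given value is NaN if I don't know beforehand that it is a number. The following seems to solve this issue:

if str(x) == 'nan' and not isinstance(x, str):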
    print ('NaN')
else:
    print ('non NaN')

Answer 16

How to remove NaN (float) item(s) from a list of mixed data types

If you have mixed types in an iterable, here is a solution that does not use numpy:

from math import isnan

Z = ['a','b', float('NaN'), 'd', float('1.1024')]

[x for x in Z if not (
                      type(x) == float # let's drop all float values…
                      and isnan(x) # … but only if they are nan
                      )]

['a', 'b', 'd', 1.1024]

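Short-circuit evaluation means that isnan is never called on values that are not of type float, because False and (…) evaluates to False immediately, without evaluating the right-hand side.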
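
To see the short-circuit in action, the sketch below wraps isnan in a hypothetical recording helper (checked_isnan, not part of the answer) and counts the values it was actually called with.

```python
from math import isnan

seen = []  # values that actually reached isnan

def checked_isnan(x):
    seen.append(x)
    return isnan(x)

Z = ['a', 'b', float('NaN'), 'd', float('1.1024')]

result = [x for x in Z if not (type(x) == float and checked_isnan(x))]

print(result)      # ['a', 'b', 'd', 1.1024]
print(len(seen))   # 2: only the two floats were passed to isnan
```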

Answer 17

For nan of type float:

>>> import pandas as pd
>>> value = float('nan')
>>> type(value)
<class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
<class 'str'>
>>> pd.isnull(value)
False

Answer 18

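If you want to check for values that are not NaN, negate whatever is used to flag NaN; pandas has its own dedicated function for flagging non-NaN values.

import math
import numpy as np
import pandas as pd

lst = [1, 2, float('nan')]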

m1 = [e == e for e in lst]              # [True, True, False]

m2 = [not math.isnan(e) for e in lst]   # [True, True, False]

m3 = ~np.isnan(lst)                     # array([ True,  True, False])

m4 = pd.notna(lst)                      # array([ True,  True, False])

This is especially useful if you want to filter for values that are not NaN. For ndarray/Series objects, == is vectorized, so it can be used as well.

s = pd.Series(lst)
arr = np.array(lst)

x = s[s.notna()]
y = s[s==s]                             # `==` is vectorized
z = arr[~np.isnan(arr)]                 # array([1., 2.])

assert (x == y).all() and (x == z).all()

Answer 19

To filter out empty strings (''), None, and NaN values from the 'num_specimen_seen' column, we can use the pd.notna() function from pandas.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'num_specimen_seen': [10, 2, 1, '', 34, 'aw', np.nan, 5, '43', np.nan, 'ed', None, '']
})

for idx, row in df.iterrows():
    if pd.notna(row['num_specimen_seen']) and row['num_specimen_seen'] != '':
        print(idx, row['num_specimen_seen'])

While iterating over the DataFrame, this code skips NaN, None, and empty strings in the 'num_specimen_seen' column.
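
For larger frames, the same filter can be expressed as a boolean mask instead of a Python loop. This is a sketch under the assumption that the column holds the same mixed values as above; the names col, mask, and filtered are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'num_specimen_seen': [10, 2, 1, '', 34, 'aw', np.nan, 5, '43', np.nan, 'ed', None, '']
})

col = df['num_specimen_seen']
# keep rows that are neither missing (NaN/None) nor an empty string
mask = col.notna() & (col != '')
filtered = df[mask]

print(filtered)
```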

Answer 20

For strings in pandas, use pd.isnull:

if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):

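The function used for feature extraction with NLTK:

import nltk
import pandas as pd

# default_stopwords is assumed to be defined elsewhere (a collection of stopwords)
def act_features(atext):
    features = {}
    if not pd.isnull(atext):
        for word in nltk.word_tokenize(atext):
            if word not in default_stopwords:
                features['cont({})'.format(word.lower())] = True
    return features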
