Computing Information Entropy in Python

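Information entropy measures the uncertainty of the messages emitted by a given source: the more disordered and irregular the messages are, the larger the entropy. If a source always emits exactly the same message, its entropy is 0, which means the message is completely predictable.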
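For reference, the quantity described here is the Shannon entropy. The notation below is not from the original text and is introduced only for illustration: a source emits n distinct symbols, and symbol i occurs with probability p_i. With a base-2 logarithm the result is measured in bits:

```latex
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i
```

When one symbol has probability 1, the only term is 1 * log2(1) = 0, so H = 0. The maximum, log2(n), is reached when all n symbols are equally likely. The code in this article estimates each p_i by the relative frequency x/num of a value in the list.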

The main point of this article is to demonstrate the use of Python dictionaries and built-in functions.

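from math import log
from random import randint

def informationEntropy(lst):
    # Total number of data items
    num = len(lst)
    # Number of occurrences of each distinct value
    numberofNoRepeat = dict()
    for data in lst:
        numberofNoRepeat[data] = numberofNoRepeat.get(data, 0) + 1
    # Print the counts so the result can be checked by hand
    print(numberofNoRepeat)
    # Return the entropy, where x/num is the relative frequency of each value;
    # the sum itself is never positive, so abs() flips its sign
    return abs(sum(map(lambda x: x/num * log(x/num, 2), numberofNoRepeat.values())))

# Functional test: 10 random lists of 5 to 30 integers drawn from 1..5
for _ in range(10):
    lst = [randint(1, 5) for _ in range(randint(5, 30))]
    print('Entropy:', informationEntropy(lst))
    print('='*20)

# A source that always emits the same value has entropy 0
print('Entropy:', informationEntropy([1, 1, 1, 1, 1, 1]))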
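The counting loop built on dict.get can also be written with collections.Counter from the standard library. The following is a minimal sketch of that alternative, not the article's own method; the function name entropy_with_counter is hypothetical, and unlike the function above it does not print the counts.

```python
from collections import Counter
from math import log2


def entropy_with_counter(values):
    """Shannon entropy (in bits) of the relative frequencies in `values`.

    Hypothetical helper for illustration; not part of the original article.
    """
    total = len(values)
    counts = Counter(values)  # maps each distinct value to its number of occurrences
    # Each term is written as p * log2(1/p) with p = c/total, which is never
    # negative, so no abs() call is needed to fix the sign.
    return sum(c / total * log2(total / c) for c in counts.values())


print(entropy_with_counter([1, 1, 1, 1, 1, 1]))  # constant source
print(entropy_with_counter([1, 2, 1, 2]))        # two equally likely values
```

Writing the term as p * log2(1/p) is a design choice: it gives the same value as -p * log2(p) while keeping every summand non-negative.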

One run produced the following output:

{1: 4, 2: 3, 3: 9, 4: 3, 5: 8}

Entropy: 2.1608467607817

====================

{1: 3, 2: 1, 3: 5, 4: 2, 5: 7}

Entropy: 2.057924310831006

====================

{1: 5, 2: 3, 3: 2, 4: 1, 5: 2}

Entropy: 2.1339375660949167

====================

{1: 1, 3: 3, 4: 3, 5: 1}

Entropy: 1.8112781244591327

====================

{1: 3, 2: 4, 3: 1, 4: 3, 5: 2}

Entropy: 2.199687794731328

====================

{1: 1, 2: 2, 3: 5, 4: 3, 5: 3}

Entropy: 2.155968102145908

====================

{1: 1, 3: 2, 4: 2, 5: 1}

Entropy: 1.9182958340544893

====================

{1: 1, 2: 2, 4: 2, 5: 1}

Entropy: 1.9182958340544893

====================

{1: 8, 2: 4, 3: 6, 4: 5, 5: 6}

Entropy: 2.284560633641686

====================

{2: 3, 3: 1, 4: 2, 5: 2}

Entropy: 1.9056390622295662

====================

{1: 6}

Entropy: 0.0
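Any line of this output can be checked independently from the printed counts. As a small sketch, the snippet below recomputes the fourth result, {1: 1, 3: 3, 4: 3, 5: 1}, and compares it with the upper bound log2(5) that applies because the test data contains at most five distinct values. The variable names are illustrative only.

```python
from math import isclose, log2

# Counts copied from the fourth block of the output above
counts = {1: 1, 3: 3, 4: 3, 5: 1}
total = sum(counts.values())  # 8 items in that list

# Relative frequencies: 1/8, 3/8, 3/8, 1/8
probabilities = [c / total for c in counts.values()]
entropy = sum(p * log2(1 / p) for p in probabilities)

# Upper bound for a source with five possible values
max_entropy = log2(5)

print('recomputed:', entropy)
print('matches reported value:', isclose(entropy, 1.8112781244591327, rel_tol=1e-9))
print('upper bound log2(5):', max_entropy)
```

By hand, the same sum is 2 * (1/8) * 3 + 2 * (3/8) * log2(8/3), roughly 0.75 + 1.06 = 1.81 bits, below the bound of about 2.32 bits. Every entropy in the output above stays under that bound, and the final constant list gives the minimum of 0.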
