Fake data

When a test program needs data, faker can be used to generate it.

Installation

$ pip install faker

Basic usage

from faker import Faker
faker = Faker()

faker.name()
# 'Lucy Cechtelar'

faker.address()
# '426 Jordy Lodge
# Cartwrightshire, SC 88120-6700'

faker.ipv4()
# '196.67.103.129'

Each call to faker.ipv4() produces a different random result.

for _ in range(10):
    print(faker.ipv4())

# '120.36.235.152'
# '16.58.6.69'
# '170.215.56.41'
# '135.217.158.192'
# '218.235.87.38'
# '175.80.75.73'
# '122.120.1.128'
# '99.91.143.38'
# '1.90.129.142'
# '184.148.193.249'

Built-in providers

Creating a custom provider

For an enum type, you can create a custom provider.

from enum import Enum
from faker import Faker
from faker.providers import DynamicProvider

class Country(Enum):
    CHINA = "china"
    AMERICA = "america"

# Pass the enum members as a list, the form the elements parameter expects
country_provider = DynamicProvider(
    provider_name="country",
    elements=list(Country),
)

faker = Faker()
# Adds a faker.country() method that returns a random Country member
faker.add_provider(country_provider)

faker.country()
# <Country.CHINA: 'china'>

Generating data

The provider name can be looked up from the column name, and the data is then obtained by calling getattr(faker, provider_name)().

For example:

  • If the column name is src_ip or dst_ip, call the ipv4() provider
  • If the column name is created_at or updated_at, call the date_time_this_year() provider

# Maps each provider method name to the column names that should use it
fake_provider_alias = {
    "phone_number": ["tel", "phone_number"],
    "date_time_this_year": ["timestamp", "created_at", "updated_at"],
    "ipv4": ["src_ip", "dest_ip", "dst_ip"],
    "random_int": ["flow_id"],
    "country": ["country"],
    "name": ["name"],
}

def _get_provider_name(col_name):
    for provider, aliases in fake_provider_alias.items():
        if col_name in aliases:
            return provider

    # Fall back to name() for columns without an alias
    return "name"


# Person is assumed to be a Django model defined elsewhere
for _ in range(100):
    Person(
        **{
            f.name: getattr(faker, _get_provider_name(f.name))()
            for f in Person._meta.get_fields()
            if f.name != "id"
        }
    ).save()
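
The lookup above scans every alias list for each column. A minimal sketch of an alternative, assuming the alias table is fixed at import time: invert it once into an alias-to-provider dict, then resolve each column with a single lookup. The names PROVIDER_ALIASES, ALIAS_TO_PROVIDER, provider_for and fake_row are hypothetical, and the table is abbreviated.

```python
# Abbreviated copy of the alias table: provider method name -> column names
PROVIDER_ALIASES = {
    "date_time_this_year": ["timestamp", "created_at", "updated_at"],
    "ipv4": ["src_ip", "dest_ip", "dst_ip"],
    "name": ["name"],
}

# Invert the table once so each column name resolves with one dict lookup
ALIAS_TO_PROVIDER = {
    alias: provider
    for provider, aliases in PROVIDER_ALIASES.items()
    for alias in aliases
}


def provider_for(col_name, default="name"):
    """Return the provider method name for a column, or `default` if unmapped."""
    return ALIAS_TO_PROVIDER.get(col_name, default)


def fake_row(fake, columns):
    """Build one {column: value} dict by calling the matching method on `fake`."""
    return {col: getattr(fake, provider_for(col))() for col in columns}
```

Called as fake_row(faker, ["src_ip", "created_at"]) with a Faker instance, this should return a dict holding one generated value per column. Unmapped columns fall back to name(), matching the behavior of _get_provider_name.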