Data Model¶
This page discusses the data model for mkname and related topics.
Name Data¶
The core feature of mkname is the selection or generation of
names from a large list of names. The names used by mkname
are stored as mkname.model.Name objects. Each object
stores several pieces of information about its name in the following
fields: id, name, source, culture, date, gender, and kind.
For example, let’s say you want to add the name “Graham,” as in the first name of “Graham Chapman” from Monty Python:
>>> from mkname.model import Name
>>> name = Name(
... id=0,
... name='Graham',
... source='https://montypython.com',
... culture='MontyPython',
... date=1941,
... gender='python',
... kind='given'
... )
>>> name.name
'Graham'
Name Data Fields¶
The following are the data fields stored for a name in the names database.
id¶
This is a simple serial number used to uniquely identify the
mkname.model.Name object when it is serialized in a
database or other output file.
name¶
This is the name itself as a str object.
The only limitation on this, beyond any set by the str class,
is that it has a maximum size limit of 64 characters. This limit only
exists to provide a boundary for the database. Future versions of
mkname could increase it if there are cultures with names
longer than 64 characters.
source¶
This is the source where the name was found as a str object.
It’s intended to be the specific URL for the data the name was pulled from. For example, some of the names in the default database were pulled from the U.S. Census’s list of the most common surnames in 2010. The source field for those names is the URL for that report on the U.S. Census website:
https://www.census.gov/topics/population/genealogy/data/2010_surnames.html
There are three main reasons the source data is kept with the name:
- It provides context for why the name is in the database.
- It credits the people or organization that gathered the name data.
- It allows data to be identified and pulled from the database if needed for some reason in the future.
The maximum length of a source is 128 characters.
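As a rough illustration of the third reason, the sketch below separates records by their source so that everything from one report could be reviewed or removed together. The records are plain dictionaries rather than mkname.model.Name objects, and the split_by_source helper is hypothetical. Neither is part of mkname.

```python
# Sample data for illustration only; not taken from the mkname database.
CENSUS_URL = (
    'https://www.census.gov/topics/population/genealogy/'
    'data/2010_surnames.html'
)

records = [
    {'name': 'Smith', 'source': CENSUS_URL},
    {'name': 'Graham', 'source': 'https://montypython.com'},
]


def split_by_source(records, source):
    """Hypothetical helper: split records into those from a source
    and those from anywhere else."""
    matched = [r for r in records if r['source'] == source]
    kept = [r for r in records if r['source'] != source]
    return matched, kept


matched, kept = split_by_source(records, CENSUS_URL)
print([r['name'] for r in matched])
print([r['name'] for r in kept])
```

The Census URL used here is 74 characters long, so it fits within the 128 character limit described above.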
culture¶
This is the culture the name is from as a str object.
As used in the default database, this is the nation associated with the source I got the name from. However, the field is intended to be broader than that. It’s, essentially, any grouping of people you wish to associate the name with. For example, if you were adding the names from the works of J. R. R. Tolkien, you might mark the names of hobbits as “Hobbit” and those of dwarves as “Dwarf.” And, of course, you can split the elven names into “Sindar” and “Quenyan,” and so on.
The purpose of this field is to allow name generation to be narrowed by culture. If you want to generate the name of someone from the Roman Empire, you can limit name generation to just the “Roman” culture.
The maximum length of a culture is 64 characters.
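The narrowing described above could look something like the following sketch. NameRecord is a hypothetical stand-in that mirrors the field names of mkname.model.Name, filter_by_culture is not an mkname function, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class NameRecord:
    """Hypothetical stand-in with the same fields as mkname.model.Name."""
    id: int
    name: str
    source: str
    culture: str
    date: int
    gender: str
    kind: str


def filter_by_culture(names, culture):
    """Return only the names tagged with the given culture."""
    return [n for n in names if n.culture == culture]


# Made-up sample records; these are not in the default database.
names = [
    NameRecord(0, 'Bilbo', 'example', 'Hobbit', 1937, 'male', 'given'),
    NameRecord(1, 'Gimli', 'example', 'Dwarf', 1954, 'male', 'given'),
    NameRecord(2, 'Frodo', 'example', 'Hobbit', 1954, 'male', 'given'),
]

hobbits = filter_by_culture(names, 'Hobbit')
print([n.name for n in hobbits])
```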
date¶
This is the date associated with the data the name was taken from, as an
int object.
As used in the default database, this is the year for the name in the Common Era (C.E.). Negative values are Before Common Era (B.C.E.).
The date is stored in the SQLite database as an INTEGER. This can be a signed number of up to 8 bytes, in case you are projecting names that far into the future or the past.
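To show what that range means in practice, here is a small sketch using Python's built-in sqlite3 module. The table and its two columns are assumptions made for this example, not the actual mkname schema, and the rows are made-up sample data.

```python
import sqlite3

# An in-memory database with a hypothetical, simplified table.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE names (name TEXT, date INTEGER)')

rows = [
    ('ExampleAncient', -753),          # A B.C.E. year as a negative value.
    ('Graham', 1941),
    ('ExampleFuture', 2**63 - 1),      # Largest 8-byte signed integer.
]
con.executemany('INSERT INTO names VALUES (?, ?)', rows)

# Negative dates can be selected like any other integer.
bce = con.execute(
    'SELECT name, date FROM names WHERE date < 0'
).fetchall()
largest = con.execute('SELECT MAX(date) FROM names').fetchone()[0]
con.close()

print(bce)
print(largest)
```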
gender¶
This is the “gender” of the name as a str object.
Name data, especially “given” name data, tends to associate a gender with a name. This gender is tracked in the gender field for the record, so it can be used to filter the names used when generating a name.
The maximum length of a gender is 64 characters.
kind¶
This is the position or function of the name as a str
object.
As used in the default database, there are two kinds of names:
- given: The name associated with the individual. In the United States this tends to be the name listed first, i.e. the “first name,” but that’s not true of all cultures.
- surname: The name associated with a family. In the United States this tends to be the name listed last, i.e. the “last name,” but that is not true for all cultures.
The maximum length of a kind is 16 characters.
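One way the kind field could be used is to pick one name of each kind and join them. The sketch below assumes a reduced, hypothetical Record type with only two fields, and the helpers by_kind and make_full_name are not mkname functions. It also puts the given name first, which, as noted above, is not the order every culture uses.

```python
import random
from collections import namedtuple

# Hypothetical, reduced record with only the fields this example needs.
Record = namedtuple('Record', 'name kind')

# Made-up sample data for illustration.
names = [
    Record('Graham', 'given'),
    Record('Terry', 'given'),
    Record('Chapman', 'surname'),
    Record('Jones', 'surname'),
]


def by_kind(names, kind):
    """Return the records whose kind matches."""
    return [n for n in names if n.kind == kind]


def make_full_name(names, rng):
    """Pick one given name and one surname, given name first."""
    given = rng.choice(by_kind(names, 'given'))
    surname = rng.choice(by_kind(names, 'surname'))
    return f'{given.name} {surname.name}'


rng = random.Random(0)  # Seeded so repeated runs pick the same names.
full_name = make_full_name(names, rng)
print(full_name)
```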
Model API¶
The following is a description of the public API for the data model.
Core Data¶
- class mkname.model.Name(id: IsInt, name: IsStr, source: IsStr, culture: IsStr, date: IsInt, gender: IsStr, kind: IsStr)[source]¶
A name to use for generation.
- Parameters:
id – A unique identifier for the name.
name – The name.
source – The URL where the name was found.
culture – The culture or nation the name is tied to.
date – The approximate year the name is tied to.
gender – The gender typically associated with the name during the time and in the culture the name is from.
kind – A tag for how the name is used, such as a given name or a surname.
- Usage:
>>> id = 1138
>>> name = 'Graham'
>>> src = 'Monty Python'
>>> culture = 'UK'
>>> date = 1941
>>> gender = 'python'
>>> kind = 'given'
>>> Name(id, name, src, culture, date, gender, kind)
Name(id=1138, name='Graham', source='Monty Python'...
Validating Descriptors¶
The following descriptors are used by mkname.model.Name to
validate and normalize data.
- class mkname.model.IsInt(*, default: int = 0)[source]¶
A data descriptor that ensures data is an int.
- Parameters:
default – (Optional.) The default value, if any, of the described attribute. Defaults to 0.
- Returns:
A mkname.model.IsInt object.
- class mkname.model.IsStr(*, default: str = '', size: int = 65595)[source]¶
A data descriptor that ensures data is a str.
- Parameters:
default – (Optional.) The default value, if any, of the described attribute. Defaults to an empty str.
size – (Optional.) The maximum length of the value of the described attribute. Defaults to 65,595.
- Returns:
A mkname.model.IsStr object.
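For readers unfamiliar with validating descriptors, the following is a minimal sketch of the general technique using Python's descriptor protocol. It is not the mkname implementation. The class names IsStrSketch and IsIntSketch are hypothetical, and the choices to coerce values with str() and int() and to raise ValueError on oversized strings are assumptions made for this example.

```python
class IsIntSketch:
    """Hypothetical descriptor that coerces assigned values to int."""

    def __init__(self, *, default=0):
        self.default = default

    def __set_name__(self, owner, name):
        # Store the value on the instance under a private name.
        self.private_name = '_' + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name, self.default)

    def __set__(self, obj, value):
        setattr(obj, self.private_name, int(value))


class IsStrSketch(IsIntSketch):
    """Hypothetical descriptor that coerces to str and limits length."""

    def __init__(self, *, default='', size=64):
        super().__init__(default=default)
        self.size = size

    def __set__(self, obj, value):
        text = str(value)
        if len(text) > self.size:
            # Assumed behavior; mkname may handle this differently.
            raise ValueError(f'Value exceeds {self.size} characters.')
        setattr(obj, self.private_name, text)


class Record:
    """Hypothetical class using the sketch descriptors."""
    name = IsStrSketch(size=64)
    date = IsIntSketch()

    def __init__(self, name, date):
        self.name = name
        self.date = date


record = Record('Graham', '1941')
print(record.name, record.date)
```

Because the checks live in the descriptors, they run on every assignment to the attribute, not only when the object is created.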