Same data from different entities in Database - Best Practice - Phone numbers example
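Given a database system that deals with staff, customers, and suppliers, all of whom have multiple possible phone numbers, how would you go about storing these numbers in a nicely normalized way? I've thought about it a bit, and the logical way isn't jumping out at me.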
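In most cases . . .
- "Staff" always describes people.
- Some customers are people.
- Some customers are businesses (organizations).
- "Suppliers" are usually (always?) organizations.
- Staff can also be customers.
- Suppliers can also be customers.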
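There are serious problems with having separate tables of staff phone numbers, supplier phone numbers, and customer phone numbers.
- Staff can be customers. If a staff phone number changes, does a customer phone number also need to be updated? How do you know which one to update?
- Suppliers can be customers. If a supplier's phone number changes, does a customer phone number also need to be updated? How do you know which one to update?
- You have to duplicate and maintain, without error, the constraints for phone numbers in every table that stores phone numbers.
- The same problems arise when a customer's phone number changes. Now you have to check whether staff and supplier phone numbers need to be updated, too.
- To answer the question "Whose phone number is 123-456-7890?", you have to look in "n" different tables, where "n" is the number of different "kinds" of parties you deal with. In addition to staff, customers, and suppliers, think "contractor's phones", "prospect's phones", etc.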
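The last point can be made concrete. The following is only a sketch of the problem, assuming three hypothetical per-entity tables (staff_phones, customer_phones, supplier_phones) that are not part of any recommended design:

```sql
-- Hypothetical per-entity phone tables, shown only to illustrate the problem.
create table staff_phones    (staff_id    integer not null, phone_number char(10) not null);
create table customer_phones (customer_id integer not null, phone_number char(10) not null);
create table supplier_phones (supplier_id integer not null, phone_number char(10) not null);

-- "Whose phone number is 1234567890?" needs one branch per kind of party.
select 'staff' as party_kind, staff_id as id
from staff_phones
where phone_number = '1234567890'
union all
select 'customer', customer_id
from customer_phones
where phone_number = '1234567890'
union all
select 'supplier', supplier_id
from supplier_phones
where phone_number = '1234567890';
```

Every new kind of party (contractors, prospects) adds another table, another copy of the constraints, and another branch to this query.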
You need to implement a supertype/subtype schema. (PostgreSQL code, not rigorously tested.)
    create table parties (
        party_id integer not null unique,
        party_type char(1) check (party_type in ('I', 'O')),
        party_name varchar(10) not null unique,
        primary key (party_id, party_type)
    );

    insert into parties values (1,'I', 'Mike');
    insert into parties values (2,'I', 'Sherry');
    insert into parties values (3,'O', 'Vandelay');

    -- For "persons", a subtype of "parties"
    create table person_st (
        party_id integer not null unique,
        party_type char(1) not null default 'I' check (party_type = 'I'),
        height_inches integer not null check (height_inches between 24 and 108),
        primary key (party_id),
        foreign key (party_id, party_type)
            references parties (party_id, party_type) on delete cascade
    );

    insert into person_st values (1, 'I', 72);
    insert into person_st values (2, 'I', 60);

    -- For "organizations", a subtype of "parties"
    create table organization_st (
        party_id integer not null unique,
        party_type CHAR(1) not null default 'O' check (party_type = 'O'),
        ein CHAR(10), -- In US, federal Employer Identification Number
        primary key (party_id),
        foreign key (party_id, party_type)
            references parties (party_id, party_type) on delete cascade
    );

    insert into organization_st values (3, 'O', '00-0000000');

    create table phones (
        party_id integer references parties (party_id) on delete cascade,
        -- Whatever you prefer to distinguish one kind of phone usage from another.
        -- I'll just use a simple 'phone_type' here, for work, home, emergency,
        -- business, and mobile.
        phone_type char(1) not null default 'w'
            check (phone_type in ('w', 'h', 'e', 'b', 'm')),
        -- Phone numbers in the USA are 10 chars. YMMV.
        phone_number char(10) not null check (phone_number ~ '[0-9]{10}'),
        primary key (party_id, phone_type)
    );

    insert into phones values (1, 'h', '0000000000');
    insert into phones values (1, 'm', '0000000001');
    insert into phones values (3, 'h', '0000000002');

    -- Do what you need to do on your platform--triggers, rules, whatever--to make
    -- these views updatable. Client code uses the views, not the base tables.
    -- In current versions of PostgreSQL, I think you'd create some "instead
    -- of" rules.
    --
    create view people as
    select t1.party_id, t1.party_name, t2.height_inches
    from parties t1
    inner join person_st t2 on (t1.party_id = t2.party_id);

    create view organizations as
    select t1.party_id, t1.party_name, t2.ein
    from parties t1
    inner join organization_st t2 on (t1.party_id = t2.party_id);

    create view phone_book as
    select t1.party_id, t1.party_name, t2.phone_type, t2.phone_number
    from parties t1
    inner join phones t2 on (t1.party_id = t2.party_id);
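As a brief illustration of what this schema buys you, here is a sketch that assumes the tables, sample rows, and phone_book view above have been created as written:

```sql
-- "Whose phone number is 0000000001?" is now a single lookup,
-- no matter how many kinds of parties exist.
select party_id, party_name, phone_type
from phone_book
where phone_number = '0000000001';
-- With the sample rows above, this finds party 1 (Mike), phone type 'm'.

-- The composite foreign key keeps subtypes honest. Party 3 is an organization,
-- so parties has no row (3, 'I') and the statement below would be rejected.
-- It is left commented out so the script runs cleanly.
-- insert into person_st values (3, 'I', 70);
```

The check constraint on person_st.party_type likewise rules out inserting (3, 'O', 70) there, so a party can only appear in the subtype table that matches its type.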
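To stretch this out a little further, a table to implement "staff" needs to reference the person subtype, not the party supertype. Organizations can't be staff.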
    create table staff (
        party_id integer primary key references person_st (party_id) on delete cascade,
        employee_number char(10) not null unique,
        first_hire_date date not null default CURRENT_DATE
    );
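If suppliers can only be organizations, not individuals, then a table implementing suppliers would reference the organizations subtype in a similar way.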
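A minimal sketch of such a table, assuming the organization_st table from the schema above; the table name suppliers is an assumption, and the attribute list is left open:

```sql
-- Suppliers reference the organization subtype, so a person can never be a supplier.
create table suppliers (
    party_id integer primary key references organization_st (party_id) on delete cascade
    -- Other attributes of suppliers
);
```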
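For most companies, a customer can be either a person or an organization, so a table implementing customers should reference the supertype.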
    create table customers (
        party_id integer primary key references parties (party_id) on delete cascade
        -- Other attributes of customers
    );
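To see how this addresses the "staff can also be customers" problem, here is a sketch that assumes the parties, phones, staff, and customers tables and the phone_book view above; the employee number is made up for illustration:

```sql
-- Mike (party 1) is both a staff member and a customer.
insert into staff (party_id, employee_number) values (1, 'E000000001');
insert into customers (party_id) values (1);

-- His phone numbers live only in phones, so there is exactly one row to update.
update phones
set phone_number = '0000000003'
where party_id = 1 and phone_type = 'm';

-- Staff who are also customers, with their phone numbers.
select s.employee_number, pb.party_name, pb.phone_type, pb.phone_number
from staff s
inner join customers c on c.party_id = s.party_id
inner join phone_book pb on pb.party_id = s.party_id;
```

Because both roles point at the same party, there is no second copy of the number that could fall out of date.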
Mike Sherrill 'Cat Recall''s answer works on MariaDB with only one change: "~" needs to be replaced, because MariaDB has no such operator. Use "regexp" rather than "like": "like" understands only the % and _ wildcards, so a pattern such as '[0-9]{10}' would be compared as literal text and would reject every real phone number.
Here is his example adapted for MariaDB. Here I've also made the requested change to types described with words instead of single characters.

    create table parties (
        party_id integer not null unique,
        party_type varchar(20) not null check (party_type in ('individual', 'organization')),
        party_name varchar(50) not null unique,
        primary key (party_id, party_type)
    );

    insert into parties values (1,'individual', 'Mike');
    insert into parties values (2,'individual', 'Sherry');
    insert into parties values (3,'organization', 'Vandelay');

    -- For "persons", a subtype of "parties"
    create table person_st (
        party_id integer not null unique,
        party_type varchar(20) not null default 'individual' check (party_type = 'individual'),
        height_inches integer not null check (height_inches between 24 and 108),
        primary key (party_id),
        foreign key (party_id, party_type)
            references parties (party_id, party_type) on delete cascade
    );

    insert into person_st values (1, 'individual', 72);
    insert into person_st values (2, 'individual', 60);

    -- For "organizations", a subtype of "parties"
    create table organization_st (
        party_id integer not null unique,
        party_type varchar(20) not null default 'organization' check (party_type = 'organization'),
        ein CHAR(10), -- In US, federal Employer Identification Number
        primary key (party_id),
        foreign key (party_id, party_type)
            references parties (party_id, party_type) on delete cascade
    );

    insert into organization_st values (3, 'organization', '00-0000000');

    create table phones (
        party_id integer references parties (party_id) on delete cascade,
        -- Whatever you prefer to distinguish one kind of phone usage from another.
        -- I'll just use a simple 'phone_type' here, for work, home, emergency,
        -- business, and mobile.
        phone_type varchar(10) not null default 'work'
            check (phone_type in ('work', 'home', 'emergency', 'business', 'mobile')),
        -- Phone numbers in the USA are 10 chars. YMMV.
        -- regexp, not like: like does not interpret regular expressions.
        phone_number char(10) not null check (phone_number regexp '^[0-9]{10}$'),
        primary key (party_id, phone_type)
    );

    insert into phones values (1, 'home', '0000000000');
    insert into phones values (1, 'mobile', '0000000001');
    insert into phones values (3, 'home', '0000000002');

    -- Do what you need to do on your platform--triggers, rules, whatever--to make
    -- these views updatable. Client code uses the views, not the base tables.
    -- Inserting and Updating with Views - MariaDB Knowledge Base
    -- https://mariadb.com/kb/en/library/inserting-and-updating-with-views/
    --
    create view people as
    select t1.party_id, t1.party_name, t2.height_inches
    from parties t1
    inner join person_st t2 on (t1.party_id = t2.party_id);

    create view organizations as
    select t1.party_id, t1.party_name, t2.ein
    from parties t1
    inner join organization_st t2 on (t1.party_id = t2.party_id);

    create view phone_book as
    select t1.party_id, t1.party_name, t2.phone_type, t2.phone_number
    from parties t1
    inner join phones t2 on (t1.party_id = t2.party_id);
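When porting the phone_number check, it is worth verifying how MariaDB treats a regular-expression pattern under each operator. A quick sketch that can be run in any MariaDB session:

```sql
-- like treats the pattern as literal text plus the % and _ wildcards,
-- so a string of ten digits does not match '[0-9]{10}'.
select '0000000000' like '[0-9]{10}';      -- returns 0

-- regexp interprets the pattern as a regular expression.
select '0000000000' regexp '^[0-9]{10}$';  -- returns 1
```

Older MariaDB releases parsed check constraints without enforcing them, which can hide a pattern that never matches; on a version that enforces them, a check written with like and this pattern would reject the sample phone rows.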
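I think the decision needs to be based on a practical assessment of how important this contact information is, how frequently it changes, and how much overlap there might be between the different types of people with phone numbers.
If the contact information is volatile and/or really central to the application, then more normalization will probably be better. This would mean having a phone number table that your various customer, supplier, and employee tables (etc.) could point at, or, more likely, one referenced through some kind of three-way intersection between contact type, contact individual (customer/supplier/employee), and contact point (phone). This way an employee's home phone number can also be the primary business number on their customer record, and if it changes, it is changed once for every usage of that contact point.
On the other hand, if you're storing phone numbers just for the sake of it, you don't use them, and you probably won't maintain them, then spending a lot of time and effort modeling and building this sophistication into your database won't be worth it. You can do the good, old-fashioned Phone1, Phone2, Phone3, … columns on customer, supplier, employee, or what have you. This is bad database design, but it is good system development practice insofar as it applies the 80/20 rule to identifying project priorities.
So to sum up: if the data matters, do it right; if the data doesn't really matter, just slap it in, or better yet, leave it out altogether.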
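One way the three-way intersection might look, as a sketch only. All table and column names here are hypothetical, and the "contact individual" is assumed to be a row in the parties table from the earlier answer (any table that identifies the customer, supplier, or employee would serve):

```sql
-- A contact point is a phone number stored exactly once.
create table contact_points (
    contact_point_id integer primary key,
    phone_number char(10) not null unique
);

create table contact_types (
    contact_type varchar(20) primary key   -- e.g. 'home', 'primary business'
);

-- Who uses which contact point, and in what capacity.
create table party_contact_points (
    party_id integer not null references parties (party_id) on delete cascade,
    contact_type varchar(20) not null references contact_types (contact_type),
    contact_point_id integer not null references contact_points (contact_point_id),
    primary key (party_id, contact_type)
);

insert into contact_points values (1, '0000000000');
insert into contact_types values ('home'), ('primary business');
-- Party 1 uses the same number as a home phone and as a primary business number.
insert into party_contact_points values (1, 'home', 1), (1, 'primary business', 1);

-- One update changes the number for every usage of that contact point.
update contact_points set phone_number = '0000000009' where contact_point_id = 1;
```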
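The most straightforward way is probably best. Even if staff, customers, and suppliers all have a place for phone, cell phone, and fax numbers, it is probably best to just put those fields on each table.
But the more such fields you have, the more you should consider some sort of "inheritance" or centralization. If there is other contact information as well as multiple phone numbers, you could keep these common values in a centralized table, contacts. Fields specific to being a customer, supplier, etc. would be in separate tables. The customer table, for example, would have a contactid foreign key back to contacts.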
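A minimal sketch of that centralized layout. The column names and types are assumptions chosen for illustration; only the contacts table and the foreign key from customer come from the description above:

```sql
-- Common contact values live in one place.
create table contacts (
    contact_id integer primary key,
    phone varchar(20),
    cell_phone varchar(20),
    fax varchar(20),
    email varchar(100)
);

create table customer (
    customer_id integer primary key,
    contact_id integer not null references contacts (contact_id)
    -- Fields specific to customers
);

create table supplier (
    supplier_id integer primary key,
    contact_id integer not null references contacts (contact_id)
    -- Fields specific to suppliers
);
```

A customer and a supplier that are the same real-world entity can share one contacts row, so a changed number is updated once.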