16.2.4 Race Group and Ethnicity
Table 5: Race or Race/Ethnicity Combined
| Categories reported | Score |
| --- | --- |
| White, Asian, Black or African American, Hispanic or Latino, Middle Eastern or North African | +2 |
| White, Asian, Black or African American, Hispanic or Latino, Middle Eastern or North African, American Indian or Alaska Native, Native Hawaiian or Other Pacific Islander, Mixed | +3 |
Table 6: Detailed Race or Race/Ethnicity Combined
| Population of detailed race or race/ethnicity combined group | Examples | Score |
| --- | --- | --- |
| >4,000,000 | Mexican | +1 |
| 300,001 – 4,000,000 | Chinese, Filipino, German, Asian Indian, Italian, Korean, Salvadoran, Guatemalan | +2 |
| 100,001 – 300,000 | Japanese, Armenian, Iranian, Aztec, Portuguese, Taiwanese, Hmong, Puerto Rican, Peruvian | +3 |
| 20,001 – 100,000 | Cambodian, Dutch, Pakistani, Egyptian, Thai, Maya, Afghan, Nigerian, Indonesian, Fijian, Native Hawaiian, Jamaican, Cuban, Colombian, Argentinean | +5 |
| ≤20,000 | Tongan, Chamorro, Bangladeshi, Sri Lankan, Brazilian, Mixtec, Kenyan, Zapotec, Malaysian, Belizean, Chumash, Sudanese, Pomo, Inca, Pipil | +7 |
Table 7: Ethnicity Scoring
| Categories reported | Score |
| --- | --- |
| Hispanic or Latino (yes or no) | +1 |
Table 8: Detailed Ethnicity Scoring
| Population of detailed ethnicity group | Examples | Score |
| --- | --- | --- |
| >4,000,000 | Mexican | +1 |
| 300,001 – 4,000,000 | Salvadoran, Guatemalan, Central American, South American | +2 |
| 100,001 – 300,000 | Puerto Rican, Spaniard, Peruvian, Nicaraguan, Honduran | +3 |
| 20,001 – 100,000 | Cuban, Colombian, Argentinean, Dominican, Panamanian | +5 |
| ≤20,000 | Bolivian, Uruguayan, Paraguayan | +7 |
Race and ethnicity are collected in different ways across state and federal data collection tools. At the federal level, from 1997 to 2024 the Office of Management and Budget (OMB) required federal agencies to use a minimum of five race categories:
White,
Black or African American,
American Indian or Alaska Native,
Asian, and
Native Hawaiian or Other Pacific Islander.
The 1997 OMB guidance required a separate question on ethnicity, which asked whether individuals are Hispanic or Latino. The 2020 Census asked whether individuals were of Hispanic, Latino, or Spanish origin, and requested additional specific detail on ethnicity. The US Census Bureau often uses the terms “Hispanic or Latino origin” and “Hispanic origin” interchangeably when reporting this information.
Table 9: California Population by Race Group based on the 2020 Census
| Race group | Population |
| --- | --- |
| White alone | 16,296,122 |
| Black or African American alone | 2,237,044 |
| American Indian and Alaska Native alone | 631,016 |
| Asian alone | 6,085,947 |
| Native Hawaiian and Other Pacific Islander alone | 157,263 |
| Some Other Race alone | 8,370,596 |
| Two or more races | 5,760,235 |
Table 10: California Population by Ethnicity or Race/Ethnicity based on the 2020 Census
| Ethnicity or race/ethnicity group | Population |
| --- | --- |
| Hispanic or Latino, any race | 15,579,652 |
| Not Hispanic or Latino, any race | 23,958,571 |
| White alone, not Hispanic or Latino | 13,714,587 |
| Black or African American alone, not Hispanic or Latino | 2,119,286 |
| American Indian and Alaska Native alone, not Hispanic or Latino | 156,085 |
| Asian alone, not Hispanic or Latino | 5,978,795 |
| Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino | 138,167 |
| Some Other Race alone, not Hispanic or Latino | 223,929 |
| Two or more races, not Hispanic or Latino | 1,627,722 |
In 2024, the Office of Management and Budget approved an update to “Statistical Policy Directive No. 15: Standards for Maintaining, Collecting, and Presenting Federal Data on Race and Ethnicity”. The three major changes are:
Addition of a “Middle Eastern or North African” category. Previously these individuals were classified as White.
Combination of race and ethnicity into a single question.
Allowing individuals to select multiple racial groups that they identify with, rather than allowing only one selection.
The scores as described for population size (Table 2) were used to assess risk for race group, ethnicity, and race/ethnicity, based on comparable California populations as reported for the 2020 Census. Note that these risk scores are higher than the equivalent population numbers for location, as demographic traits such as age and race/ethnicity are publicly identifiable in a way that residence is not.
Race group reporting with the categories White, Asian, Black or African American, Middle Eastern or North African, and Hispanic or Latino has been assigned a +2 score. The smallest population is Middle Eastern or North African, estimated to be above 700,000 for California. The +2 score is consistent with the score in Table 2 for a statewide population of 300,001 to 4,000,000.
Race group and race/ethnicity reporting that includes the above categories plus American Indian or Alaska Native and Native Hawaiian or Other Pacific Islander is assessed a +3 score. The smallest (exclusionary) categories are Native Hawaiian and Other Pacific Islander alone (157,263) and Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino (138,167). The +3 score is consistent with the score in Table 2 for a statewide population of 100,001 to 300,000.
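The band lookup described in the two paragraphs above can be sketched in a few lines. This is a minimal illustration only: it assumes Table 2 uses the same population bands that appear in Tables 6 and 8, and the names `SCORE_BANDS` and `population_score` are hypothetical, not part of the guidelines.

```python
# Upper bound (inclusive) of each population band and its score, as the
# bands appear in Tables 6 and 8. Assumption: Table 2 uses the same bands.
SCORE_BANDS = [
    (20_000, 7),
    (100_000, 5),
    (300_000, 3),
    (4_000_000, 2),
]
TOP_BAND_SCORE = 1  # statewide population above 4,000,000


def population_score(population: int) -> int:
    """Return the risk score for a statewide population count."""
    if population < 0:
        raise ValueError("population must be non-negative")
    for upper_bound, score in SCORE_BANDS:
        if population <= upper_bound:
            return score
    return TOP_BAND_SCORE


# Figures quoted in this section (2020 Census, California).
print(population_score(157_263))  # Pacific Islander alone -> 3
print(population_score(138_167))  # Pacific Islander alone, not Hispanic -> 3
print(population_score(700_000))  # lower estimate cited for MENA -> 2
```

The score for a reporting scheme follows from its smallest category, which is why adding Pacific Islander moves the scheme from +2 to +3.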
Scores for race, ethnicity, and race/ethnicity are based on the populations associated with these categories as reported for the 2020 Census, except for Middle Eastern or North African, which was not recorded as a separate category but was allowed as a write-in response. If reported categories differ from the 2020 Census classification methodology, scores may need to be adjusted accordingly. For more information on these populations and categories, see Appendix G.
It was not possible to take county population size into account when scoring race/ethnicity groups, because the distribution of each group differs from county to county in California (see Appendix G for the distribution of race/ethnicity groups within each county population). The scores are therefore based only on the statewide distribution of race/ethnicity groups within the California population.
Examples
Three scenarios are presented below to help demonstrate how to use the race group and ethnicity scoring criteria for data that conforms to 1997-2024 OMB standards.
First Scenario – Complete Cross-Tabulation between Race Group and Ethnicity
Consider this table:
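Table 11: First Scenario Example
| Race group | Hispanic | Non-Hispanic | Total |
| --- | --- | --- | --- |
| Black | 50 | 250 | 300 |
| White | 200 | 1,000 | 1,200 |
| Asian | 5 | 95 | 100 |
| Any race group | 255 | 1,345 | 1,600 |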
With this cross-tabulation, you would add both the Race score and Ethnicity score independently to the overall total for your scoring metric (i.e., greatest risk for re-identification). Note that you can replace “Ethnicity” with “Sex” and the principle still applies—you have a cross-tabulated table of Race and Sex.
Second Scenario – Race and Ethnicity merged into exclusive categories
The usual rule is that Ethnicity takes precedence over Race when categorizing: anyone reported as Hispanic is placed in a single Hispanic category, and the remaining categories effectively become “Non-Hispanic Race.” Accordingly, the above table would become:
Table 12: Second Scenario Example
| Race/ethnicity category | Count |
| --- | --- |
| Non-Hispanic Black | 250 |
| Non-Hispanic White | 1,000 |
| Non-Hispanic Asian | 95 |
| Hispanic | 255 |
| Total | 1,600 |
In the second scenario, use the combined Race/Ethnicity score from the guidelines in your scoring metric.
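The collapse from the first scenario's cross-tabulation to the second scenario's exclusive categories can be sketched as follows. The counts are those of the First Scenario example; the function name `merge_race_ethnicity` and the dictionary layout are illustrative assumptions.

```python
# First Scenario counts as (Hispanic, Non-Hispanic) per race group.
cross_tab = {
    "Black": (50, 250),
    "White": (200, 1_000),
    "Asian": (5, 95),
}


def merge_race_ethnicity(table):
    """Collapse a race-by-ethnicity cross-tab into exclusive categories.

    Hispanic ethnicity takes precedence: every Hispanic respondent is
    counted under "Hispanic", regardless of race group.
    """
    merged = {}
    hispanic_total = 0
    for race, (hispanic, non_hispanic) in table.items():
        merged[f"Non-Hispanic {race}"] = non_hispanic
        hispanic_total += hispanic
    merged["Hispanic"] = hispanic_total
    # The categories are mutually exclusive, so they sum to the total.
    merged["Total"] = sum(merged.values())
    return merged


for category, count in merge_race_ethnicity(cross_tab).items():
    print(f"{category}: {count}")
```

Because each person lands in exactly one category, the merged table behaves like a single variable, which is why one combined score applies.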
Third Scenario – No Interaction between Race and Ethnicity
Without an interaction between Race and Ethnicity, this could be reported as follows:
Table 13: Third Scenario Example
| Category | Count |
| --- | --- |
| Black, any Hispanic response | 300 |
| White, any Hispanic response | 1,200 |
| Asian, any Hispanic response | 100 |
| Hispanic, any race group | 255 |
Note that as displayed above, you cannot add up the categories to get a total population. For assigning a score, this is the same as reporting in two separate tables that are each scored independently:
Table 14: Third Scenario Example Continued
| Race group | Count |
| --- | --- |
| Black | 300 |
| White | 1,200 |
| Asian | 100 |
| Total | 1,600 |
Table 15: Third Scenario Example Continued
| Ethnicity | Count |
| --- | --- |
| Hispanic | 255 |
| Non-Hispanic | 1,345 |
| Total | 1,600 |
Also, you would need to run the scoring metric separately for your Race-only and Ethnicity-only datasets. Like the First Scenario, you can replace Ethnicity with Sex and it still makes sense—you now have two tables, one displaying Race and the other Sex, with no interaction between the two—which lessens the Small Cell Size problem.
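The difference between the three scenarios comes down to which scores are applied to which table. The sketch below is one possible reading, not an official calculation: it assumes the example's race groups (Black, White, Asian) fall under the +2 row of Table 5 and that Hispanic yes/no scores +1 per Table 7. The scenario labels and the function name `demographic_scores` are hypothetical.

```python
RACE_SCORE = 2       # assumption: Black, White, Asian fall under the +2 row of Table 5
ETHNICITY_SCORE = 1  # Table 7: Hispanic or Latino, yes or no
COMBINED_SCORE = 2   # assumption: merged categories fall under the +2 row of Table 5


def demographic_scores(scenario: str) -> dict:
    """Return the race/ethnicity score contribution for each published table."""
    if scenario == "cross_tabulated":
        # First Scenario: one table, both scores added to its total.
        return {"race_by_ethnicity": RACE_SCORE + ETHNICITY_SCORE}
    if scenario == "merged":
        # Second Scenario: one table, one combined race/ethnicity score.
        return {"race_ethnicity": COMBINED_SCORE}
    if scenario == "independent":
        # Third Scenario: two tables, each scored on its own.
        return {"race": RACE_SCORE, "ethnicity": ETHNICITY_SCORE}
    raise ValueError(f"unknown scenario: {scenario}")


for name in ("cross_tabulated", "merged", "independent"):
    print(name, demographic_scores(name))
```

Each returned value is only the race/ethnicity part of a table's score; the other variables in the scoring metric still have to be added for that table.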
Risk Assessment for Detailed Race and Ethnicity Groups
The scores for detailed race, ethnicity, and race/ethnicity categories are harmonized with the scores for the minimum OMB categories and created based on Table 2.
Be aware that when reporting hierarchical data with multiple levels of the hierarchy, such as broad race/ethnicity alongside detailed race/ethnicity groups, any complementary suppression algorithm will need to account for the dependent relationship between the values.
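The dependency matters because a suppressed detailed group can be recovered from its published parent total. The sketch below illustrates the check under simple assumptions: the counts are invented for illustration, the detailed groups are treated as mutually exclusive, and `recover_suppressed` is a hypothetical helper, not part of any suppression tool.

```python
def recover_suppressed(parent_total, child_counts):
    """Return the value of a lone suppressed child, if it can be derived.

    child_counts maps each detailed group to its published count, or to
    None when the cell is suppressed. With exactly one suppressed child,
    its value is the parent total minus the published children.
    """
    hidden = [label for label, count in child_counts.items() if count is None]
    if len(hidden) != 1:
        return {}
    published = sum(c for c in child_counts.values() if c is not None)
    return {hidden[0]: parent_total - published}


# Hypothetical counts: a broad group total and its detailed groups.
detailed = {"Chinese": 60, "Filipino": 32, "Hmong": None}
print(recover_suppressed(100, detailed))
```

When the function returns a value, a second detailed cell (or the parent) would also need to be suppressed to protect the first.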
Risk Assessment for Multi-Racial and Multi-Ethnic Populations
Both OMB SPD 15 and California Government Code Section 8310.9 recommend providing the ability for individuals to “select all that apply” in order to capture multi-racial and multi-ethnic identities accurately. Data display would then aggregate all responses where a specific choice is selected.
For example, the previous “exclusionary” approach would only count an individual as “Black” if “Black” was the only race selected. If another race group was also selected, that individual would be labeled as “Multi-race.”
In the new “inclusionary” approach, all responses where “Black” was selected would be included in the “Black” category, regardless of whether another racial or ethnic choice was selected as well.
Statistically, this “inclusionary” approach would increase both numerator and denominator values for specific racial/ethnic groups and subgroups. Because groups would no longer be exclusionary of one another, it would also prevent the back-calculation of suppressed cells by subtracting non-suppressed cells from the total. Both of these factors decrease re-identification risk. Therefore, no additional risk score is needed for an “inclusionary” approach when displaying data on specific race/ethnicity groups and subgroups.
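The two counting approaches can be compared on a small set of “select all that apply” responses. The responses below are invented for illustration and the function names are hypothetical; the sketch assumes every respondent selected at least one group.

```python
from collections import Counter

# Hypothetical responses, one set of selected groups per respondent.
responses = [
    {"Black"},
    {"Black", "White"},
    {"Asian"},
    {"White"},
    {"Asian", "Black"},
]


def exclusionary_counts(rows):
    """Count a group only when it is the sole selection."""
    counts = Counter()
    for selected in rows:
        if len(selected) == 1:
            counts[next(iter(selected))] += 1
        else:
            counts["Multi-race"] += 1
    return counts


def inclusionary_counts(rows):
    """Count every group a respondent selected."""
    counts = Counter()
    for selected in rows:
        counts.update(selected)
    return counts


print(exclusionary_counts(responses))
print(inclusionary_counts(responses))
```

In this example the exclusionary counts sum to the five respondents, while the inclusionary counts sum to seven, so a suppressed cell can no longer be found by subtracting the other cells from the total.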
When identifying the score for a variable, use the highest scoring criteria. For example, if a table had age groups of 0 to 11 years (+2), 12 to 14 years (+5), and 15 to 18 years (+5) then the score for the Age Range variable would be +5 because the smallest age range is 12 to 14, which is an age range of three years. Similarly, if a table had race groups of Chinese (+2), Japanese (+3), Cambodian (+5), and Malaysian (+7) then the score for the Detailed Race Group variable would be +7 because it is the highest score for the reported groups.
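The “use the highest score” rule amounts to taking a maximum over the reported categories. A minimal sketch, using the category scores from the paragraph above; the function name `variable_score` is hypothetical.

```python
def variable_score(category_scores: dict) -> int:
    """Return the score for a variable: the highest score among its categories."""
    if not category_scores:
        raise ValueError("at least one category is required")
    return max(category_scores.values())


age_ranges = {"0 to 11 years": 2, "12 to 14 years": 5, "15 to 18 years": 5}
detailed_race = {"Chinese": 2, "Japanese": 3, "Cambodian": 5, "Malaysian": 7}

print(variable_score(age_ranges))     # 5
print(variable_score(detailed_race))  # 7
```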