Bulk data access

This tutorial explains how to retrieve full tables from the database into pandas DataFrames.

The following tables are available from mendeleev

  • elements

  • ionicradii

  • ionizationenergies

  • oxidationstates

  • groups

  • series

  • isotopes

All data is stored in a sqlite database that is shipped together with the package. You can interact directly with the database if you need more flexibility but for convenience mendeleev provides a few functions in the fetch module to retrieve data.

To fetch whole tables you can use fetch_table. The function can be imported from mendeleev.fetch

[1]:
from mendeleev.fetch import fetch_table

To retrieve a table call the fetch_table with the table name as argument. Here we’ll get probably the most important table elements with basis data on each element

[3]:
ptable = fetch_table("elements")

Now we can use pandas’ capabilities to work with the data.

[4]:
ptable.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 118 entries, 0 to 117
Data columns (total 70 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   annotation                     118 non-null    object
 1   atomic_number                  118 non-null    int64
 2   atomic_radius                  90 non-null     float64
 3   atomic_volume                  91 non-null     float64
 4   block                          118 non-null    object
 5   boiling_point                  96 non-null     float64
 6   density                        95 non-null     float64
 7   description                    109 non-null    object
 8   dipole_polarizability          117 non-null    float64
 9   electron_affinity              77 non-null     float64
 10  electronic_configuration       118 non-null    object
 11  evaporation_heat               88 non-null     float64
 12  fusion_heat                    75 non-null     float64
 13  group_id                       90 non-null     float64
 14  lattice_constant               87 non-null     float64
 15  lattice_structure              91 non-null     object
 16  melting_point                  100 non-null    float64
 17  name                           118 non-null    object
 18  period                         118 non-null    int64
 19  series_id                      118 non-null    int64
 20  specific_heat                  81 non-null     float64
 21  symbol                         118 non-null    object
 22  thermal_conductivity           66 non-null     float64
 23  vdw_radius                     103 non-null    float64
 24  covalent_radius_cordero        96 non-null     float64
 25  covalent_radius_pyykko         118 non-null    float64
 26  en_pauling                     85 non-null     float64
 27  en_allen                       71 non-null     float64
 28  jmol_color                     109 non-null    object
 29  cpk_color                      103 non-null    object
 30  proton_affinity                32 non-null     float64
 31  gas_basicity                   32 non-null     float64
 32  heat_of_formation              89 non-null     float64
 33  c6                             43 non-null     float64
 34  covalent_radius_bragg          37 non-null     float64
 35  vdw_radius_bondi               28 non-null     float64
 36  vdw_radius_truhlar             16 non-null     float64
 37  vdw_radius_rt                  9 non-null      float64
 38  vdw_radius_batsanov            65 non-null     float64
 39  vdw_radius_dreiding            21 non-null     float64
 40  vdw_radius_uff                 103 non-null    float64
 41  vdw_radius_mm3                 94 non-null     float64
 42  abundance_crust                88 non-null     float64
 43  abundance_sea                  81 non-null     float64
 44  molcas_gv_color                103 non-null    object
 45  en_ghosh                       103 non-null    float64
 46  vdw_radius_alvarez             94 non-null     float64
 47  c6_gb                          86 non-null     float64
 48  atomic_weight                  118 non-null    float64
 49  atomic_weight_uncertainty      74 non-null     float64
 50  is_monoisotopic                21 non-null     object
 51  is_radioactive                 118 non-null    bool
 52  cas                            118 non-null    object
 53  atomic_radius_rahm             96 non-null     float64
 54  geochemical_class              76 non-null     object
 55  goldschmidt_class              118 non-null    object
 56  metallic_radius                56 non-null     float64
 57  metallic_radius_c12            63 non-null     float64
 58  covalent_radius_pyykko_double  108 non-null    float64
 59  covalent_radius_pyykko_triple  80 non-null     float64
 60  discoverers                    118 non-null    object
 61  discovery_year                 105 non-null    float64
 62  discovery_location             106 non-null    object
 63  name_origin                    118 non-null    object
 64  sources                        118 non-null    object
 65  uses                           112 non-null    object
 66  mendeleev_number               118 non-null    int64
 67  dipole_polarizability_unc      117 non-null    float64
 68  pettifor_number                103 non-null    float64
 69  glawe_number                   103 non-null    float64
dtypes: bool(1), float64(46), int64(4), object(19)
memory usage: 63.8+ KB

For clarity let’s take only a subset of columns

[5]:
cols = [
    "atomic_number",
    "symbol",
    "atomic_radius",
    "en_pauling",
    "block",
    "vdw_radius_mm3",
]
[6]:
ptable[cols].head()
[6]:
atomic_number symbol atomic_radius en_pauling block vdw_radius_mm3
0 1 H 25.0 2.20 s 162.0
1 2 He 120.0 NaN s 153.0
2 3 Li 145.0 0.98 s 255.0
3 4 Be 105.0 1.57 s 223.0
4 5 B 85.0 2.04 p 215.0

It is quite easy now to get descriptive statistics on the data.

[7]:
ptable[cols].describe()
[7]:
atomic_number atomic_radius en_pauling vdw_radius_mm3
count 118.000000 90.000000 85.000000 94.000000
mean 59.500000 149.844444 1.748588 248.468085
std 34.207699 40.079110 0.634442 36.017828
min 1.000000 25.000000 0.700000 153.000000
25% 30.250000 135.000000 1.240000 229.000000
50% 59.500000 145.000000 1.700000 244.000000
75% 88.750000 178.750000 2.160000 269.250000
max 118.000000 260.000000 3.980000 364.000000

Isotopes table

Let try and retrieve another table, namely isotopes

[8]:
isotopes = fetch_table("isotopes", index_col="id")
[9]:
isotopes.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 406 entries, 1 to 406
Data columns (total 11 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   atomic_number      406 non-null    int64
 1   mass               377 non-null    float64
 2   abundance          288 non-null    float64
 3   mass_number        406 non-null    int64
 4   mass_uncertainty   377 non-null    float64
 5   is_radioactive     406 non-null    bool
 6   half_life          121 non-null    float64
 7   half_life_unit     85 non-null     object
 8   spin               323 non-null    float64
 9   g_factor           323 non-null    float64
 10  quadrupole_moment  320 non-null    float64
dtypes: bool(1), float64(7), int64(2), object(1)
memory usage: 35.3+ KB

Merge the elements table with the isotopes

We can now perform SQL-like merge operation on two DataFrames and produce an outer join

[10]:
import pandas as pd
[11]:
merged = pd.merge(ptable[cols], isotopes, how="outer", on="atomic_number")

now we have the following columns in the merged DataFrame

[12]:
merged.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 406 entries, 0 to 405
Data columns (total 16 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   atomic_number      406 non-null    int64
 1   symbol             406 non-null    object
 2   atomic_radius      328 non-null    float64
 3   en_pauling         313 non-null    float64
 4   block              406 non-null    object
 5   vdw_radius_mm3     350 non-null    float64
 6   mass               377 non-null    float64
 7   abundance          288 non-null    float64
 8   mass_number        406 non-null    int64
 9   mass_uncertainty   377 non-null    float64
 10  is_radioactive     406 non-null    bool
 11  half_life          121 non-null    float64
 12  half_life_unit     85 non-null     object
 13  spin               323 non-null    float64
 14  g_factor           323 non-null    float64
 15  quadrupole_moment  320 non-null    float64
dtypes: bool(1), float64(10), int64(2), object(3)
memory usage: 51.1+ KB
[13]:
merged.head()
[13]:
atomic_number symbol atomic_radius en_pauling block vdw_radius_mm3 mass abundance mass_number mass_uncertainty is_radioactive half_life half_life_unit spin g_factor quadrupole_moment
0 1 H 25.0 2.2 s 162.0 1.007825 0.999720 1 6.000000e-10 False NaN None 0.5 5.585695 0.00000
1 1 H 25.0 2.2 s 162.0 2.014102 0.000280 2 8.000000e-10 False NaN None 1.0 0.857438 0.00286
2 1 H 25.0 2.2 s 162.0 NaN NaN 3 NaN True NaN None 0.5 5.957994 0.00000
3 2 He 120.0 NaN s 153.0 3.016029 0.000002 3 2.000000e-08 False NaN None 0.5 -4.254995 0.00000
4 2 He 120.0 NaN s 153.0 4.002603 0.999998 4 4.000000e-10 False NaN None 0.0 0.000000 0.00000

To display all the isotopes of Silicon

[14]:
merged[merged["symbol"] == "Si"]
[14]:
atomic_number symbol atomic_radius en_pauling block vdw_radius_mm3 mass abundance mass_number mass_uncertainty is_radioactive half_life half_life_unit spin g_factor quadrupole_moment
28 14 Si 110.0 1.9 p 229.0 27.976927 0.92191 28 3.000000e-09 False NaN None 0.0 0.00000 0.0
29 14 Si 110.0 1.9 p 229.0 28.976495 0.04699 29 3.000000e-09 False NaN None 0.5 -1.11058 0.0
30 14 Si 110.0 1.9 p 229.0 29.973770 0.03110 30 2.000000e-08 False NaN None 0.0 0.00000 0.0

Ionic radii

The function to fetch ionic radii is called fetch_ionic_radii and can either fetch ionic or crystal radii depending on the radius argument.

[2]:
from mendeleev.fetch import fetch_ionic_radii
[3]:
irs = fetch_ionic_radii(radius="ionic_radius")
irs.head(10)
[3]:
coordination I II III IIIPY IV IVPY IVSQ IX V VI VII VIII X XI XII XIV
atomic_number charge
1 1 -38.0 -18.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 1 NaN NaN NaN NaN 59.0 NaN NaN NaN NaN 76.0 NaN 92.0 NaN NaN NaN NaN
4 2 NaN NaN 16.0 NaN 27.0 NaN NaN NaN NaN 45.0 NaN NaN NaN NaN NaN NaN
5 3 NaN NaN 1.0 NaN 11.0 NaN NaN NaN NaN 27.0 NaN NaN NaN NaN NaN NaN
6 4 NaN NaN -8.0 NaN 15.0 NaN NaN NaN NaN 16.0 NaN NaN NaN NaN NaN NaN
7 -3 NaN NaN NaN NaN 146.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN 16.0 NaN NaN NaN NaN NaN NaN
5 NaN NaN -10.4 NaN NaN NaN NaN NaN NaN 13.0 NaN NaN NaN NaN NaN NaN
8 -2 NaN 135.0 136.0 NaN 138.0 NaN NaN NaN NaN 140.0 NaN 142.0 NaN NaN NaN NaN
9 -1 NaN 128.5 130.0 NaN 131.0 NaN NaN NaN NaN 133.0 NaN NaN NaN NaN NaN NaN

Ionization energies

To fetch ionization energies use fetch_ionization_energies that takes a degree (default is degree=1) argument that can either be a single integer or a list if integers to fetch multiple ionization energies.

[17]:
from mendeleev.fetch import fetch_ionization_energies
[25]:
ies = fetch_ionization_energies(degree=2)
ies.head(10)
[25]:
IE2
atomic_number
1 NaN
2 54.417763
3 75.640094
4 18.211153
5 25.154830
6 24.384500
7 29.601250
8 35.121110
9 34.970810
10 40.962960
[24]:
ies_multiple = fetch_ionization_energies(degree=[1, 3, 5])
ies_multiple.head(10)
[24]:
IE1 IE3 IE5
atomic_number
1 13.598434 NaN NaN
2 24.587388 NaN NaN
3 5.391715 122.454354 NaN
4 9.322699 153.896198 NaN
5 8.298019 37.930580 340.226008
6 11.260296 47.887780 392.090500
7 14.534130 47.445300 97.890130
8 13.618054 54.935540 113.898900
9 17.422820 62.708000 114.249000
10 21.564540 63.423310 126.247000

Electronegativities

To fetch all data from electronegatuivity scales use fetch_electronegativities. This can take a few seconds since most of the values need to be computed.

[26]:
from mendeleev.fetch import fetch_electronegativities
[27]:
ens = fetch_electronegativities()
ens.head(10)
[27]:
Allen Allred-Rochow Cottrell-Sutton Ghosh Gordy Li-Xue Martynov-Batsanov Mulliken Nagle Pauling Sanderson
atomic_number
1 13.610 0.000977 0.176777 0.263800 0.031250 {('I', ''): -3.540721753312244, ('II', ''): -2... 3.687605 6.799217 0.605388 2.20 2.187771
2 24.590 0.000803 0.192241 0.442712 0.036957 {} 6.285107 12.293694 1.130639 NaN 1.000000
3 5.392 0.000073 0.098866 0.105093 0.009774 {('IV', ''): 1.7160634314550876, ('VI', ''): 1... 2.322007 2.695857 0.182650 0.98 0.048868
4 9.323 0.000187 0.138267 0.144986 0.019118 {} 3.710381 4.661350 0.375615 1.57 0.126847
5 12.130 0.000360 0.174895 0.184886 0.030588 {} 4.877958 4.149010 0.526974 2.04 0.254627
6 15.050 0.000578 0.208167 0.224776 0.043333 {} 6.083300 5.630148 0.707393 2.55 0.427525
7 18.130 0.000774 0.234371 0.264930 0.054930 {} 7.306768 7.267065 0.877498 3.04 0.577482
8 21.360 0.001146 0.268742 0.304575 0.072222 {} 8.496136 6.809027 1.042218 3.44 0.941649
9 24.800 0.001270 0.285044 0.344443 0.081250 {} 9.701808 8.711410 1.232373 3.98 1.017681
10 28.310 0.001303 0.295488 0.384390 0.087313 {} 10.918389 NaN 1.443255 NaN 1.000000