{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Bulk data access\n", "\n", "This tutorial explains how to retrieve full tables from the database into [pandas DataFrames](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).\n", "\n", "## Available tables in ``mendeleev``\n", "\n", "- [elements](#elements): The main table with element properties\n", "- [groups](#groups): Metadata about periodic table groups\n", "- [ionicradii](#ionicradii): Ionic radii data\n", "- [ionizationenergies](#ionizationenergies): Ionization energies\n", "- [isotopes](#isotopes): Isotopes and their properties\n", "- [isotopedecaymodes](#isotopedecaymodes): Data on isotope decay modes\n", "- [oxidationstates](#oxidationstates): Oxidation states for elements\n", "- [phasetransitions](#phasetransitions): Phase transition data for elements\n", "- [propertymetadata](#propertymetadata): Metadata about properties from all tables\n", "- [scattering_factors](#scattering_factors): Atomic scattering factors\n", "- [screeningconstants](#screeningconstants): Nuclear screening constants\n", "- [series](#series): Metadata about periodic table series\n", "\n", "All data is stored in a sqlite database that is shipped together with the package. You can interact directly with the database if you need more flexibility but for convenience ``mendeleev`` provides a few functions in the `fetch` module to\n", "retrieve data.\n", "\n", "## Fetching computed properties in bulk\n", "\n", "- [Electronegativities](#Electronegativities) A collection of [electronegativity](https://mendeleev.readthedocs.io/en/stable/electronegativity.html) scales computed based on other properties." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To fetch whole tables you can use `fetch_table`. The function can be imported from `mendeleev.fetch`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from mendeleev.fetch import fetch_table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To retrieve a table call the ``fetch_table`` with the table name as argument. Here we'll get probably the most important table ``elements`` with basis data on each element" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## elements" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "ptable = fetch_table(\"elements\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use [pandas'](http://pandas.pydata.org) capabilities to work with the data. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "ptable.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For clarity let's take only a subset of columns " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cols = [\n", " \"atomic_number\",\n", " \"symbol\",\n", " \"atomic_radius\",\n", " \"en_pauling\",\n", " \"block\",\n", " \"vdw_radius_mm3\",\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "ptable[cols].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is quite easy now to get descriptive statistics on the data." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "ptable[cols].describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## groups\n", "\n", "Periodic table group metadata." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "groups = fetch_table(\"groups\")\n", "groups" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ionicradii\n", "\n", "The function to fetch ionic radii is called `fetch_ionic_radii` and can either fetch ionic or crystal radii depending on the `radius` argument." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mendeleev.fetch import fetch_ionic_radii" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "irs = fetch_ionic_radii(radius=\"ionic_radius\")\n", "irs.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ionizationenergies\n", "\n", "To fetch ionization energies use `fetch_ionization_energies` that takes a `degree` (default is `degree=1`) argument that can either be a single integer or a list if integers to fetch multiple ionization energies." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mendeleev.fetch import fetch_ionization_energies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ies = fetch_ionization_energies(degree=2)\n", "ies.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To fetch multiple ionization degress at once simply pass a list of degrees:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ies_multiple = fetch_ionization_energies(degree=[1, 3, 5])\n", "ies_multiple.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## isotopes\n", "\n", "Let try and retrieve another table, namely ``isotopes``" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "isotopes = fetch_table(\"isotopes\", index_col=\"id\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "isotopes.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's a few first records from that dataframe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "isotopes.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Merge the elements table with the isotopes\n", "\n", "We can now perform SQL-like merge operation on two ``DataFrame``s and produce an [outer](http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging) join " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "merged = pd.merge(ptable[cols], isotopes, how=\"outer\", on=\"atomic_number\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "now we have the following columns in the ``merged`` ``DataFrame``" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "merged.info()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "merged.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To display all the isotopes of Silicon" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "merged[merged[\"symbol\"] == \"Si\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## isotopedecaymodes\n", "\n", "Different modes are explained in [the documentation](https://mendeleev.readthedocs.io/en/stable/data.html#isotope-decay-modes)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "idm = fetch_table(\"isotopedecaymodes\")\n", "idm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## oxidationstates" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ox = fetch_table(\"oxidationstates\")\n", "ox" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## phasetransitions\n", "\n", "For most elements a single row is present with phase transition properies, however for elements with multiple allotropes (i.e. Carbon, Phosphorus, Selenium, Sulfur, Tin) values are provided for each allotrope." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pt = fetch_table(\"phasetransitions\")\n", "pt.sort_values(by=\"atomic_number\").head(15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## propertymetadata\n", "\n", "This table holds a lot of the metadata that is used to render the data documentation page. Probably the most useful piece of information here is the unit assocaited with a given property. The units are compatible with the [pint](https://pint.readthedocs.io/en/stable/) package so properties might be used direclty with units." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata = fetch_table(\"propertymetadata\")\n", "metadata.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## scattering_factors" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scattering_factors = fetch_table(\"scattering_factors\")\n", "scattering_factors.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## screeningconstants" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "screening = fetch_table(\"screeningconstants\")\n", "screening.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## series\n", "\n", "Metadata about periodic table series." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "series = fetch_table(\"series\")\n", "series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Electronegativities\n", "\n", "To fetch all data from electronegatuivity scales use `fetch_electronegativities`. This can take a few seconds since most of the values need to be computed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mendeleev.fetch import fetch_electronegativities" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ens = fetch_electronegativities()\n", "ens.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Version information" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mendeleev\n", "\n", "print(f\"{mendeleev.__version__=}\")" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.0" } }, "nbformat": 4, "nbformat_minor": 4 }