Read all sheets from an Excel file

Read all sheets from an Excel file#

Example data with 83 individuals.

from dsp_pandas.io import read_all_excel_sheets

fname = "ProteomicsData_DIA_demo.xlsx"

excel_sheets = read_all_excel_sheets(fname)
excel_sheets.sheet_names
['Proteomics', 'Design', 'Clinic']

Then each sheet is available as a DataFrame via an attribute with the same name as the sheet. (White spaces are replaced by underscores.)

excel_sheets.Proteomics.filter(like="MS2Quantity")
1_PlateS1A1.PG.MS2Quantity 1_PlateS1A3.PG.MS2Quantity 1_PlateS1A5.PG.MS2Quantity 1_PlateS1A6.PG.MS2Quantity 1_PlateS1A8.PG.MS2Quantity 1_PlateS1A9.PG.MS2Quantity 1_PlateS1A10.PG.MS2Quantity 1_PlateS1A11.PG.MS2Quantity 1_PlateS1A12.PG.MS2Quantity 1_PlateS1B1.PG.MS2Quantity ... 1_PlateS1H1.PG.MS2Quantity 1_PlateS1H2.PG.MS2Quantity 1_PlateS1H3.PG.MS2Quantity 1_PlateS1H4.PG.MS2Quantity 1_PlateS1H5.PG.MS2Quantity 1_PlateS1H6.PG.MS2Quantity 1_PlateS1H7.PG.MS2Quantity 1_PlateS1H8.PG.MS2Quantity 1_PlateS1H11.PG.MS2Quantity 1_PlateS1H12.PG.MS2Quantity
PG.Genes
SERPINB6 153,591.375 189,227.578 204,596.922 197,446.641 151,918.281 192,898.484 148,368.078 173,446.344 146,114.422 135,809.844 ... 129,929.844 120,413.359 118,182.836 137,201.656 160,835.953 162,592.969 168,895.016 54,895.648 129,784.258 177,715.875
HDGFRP3;HDGFL3 Filtered Filtered Filtered 275,915.125 Filtered Filtered Filtered Filtered Filtered 28,306.682 ... Filtered Filtered Filtered 6,528.353 Filtered Filtered Filtered Filtered 22,945.992 Filtered
NRP2 53,418.082 75,488.070 43,254.223 62,318.590 36,034.137 11,849.309 38,514.055 102,423.414 18,820.275 19,807.359 ... Filtered 81,075.469 60,393.312 46,587.418 Filtered 32,727.561 66,809.578 Filtered Filtered 513,469.625
HDLBP 678,925.562 696,395.625 719,256.562 652,713.688 702,668.938 672,709.500 629,138.938 611,737.625 676,229.062 680,122.312 ... 742,835.625 622,681.750 652,570.750 613,087.688 706,413.750 695,044.125 601051 653,840.812 682,848.625 676,472.938
TMPIT;TMEM120A;TMEM120A;TMEM120A;TMEM120A 46,174.082 68,986.875 73,808.602 82,230.320 66,945.266 30,411.977 54,327.062 48,838.961 69,365.711 45,879.488 ... 108,945.219 48,518.402 73,517.867 73,505.805 50,101.816 95,869.039 60,248.922 37,671.199 125,917.258 45,107.781
RPS9 2,255,671.500 2,342,601.750 1,999,562.750 2,380,443.750 1,809,520.625 2039672 2,105,571.250 2236119 2001854 1,977,078.875 ... 2,420,366.500 2,171,756.500 1,977,405.375 2,018,851.750 2126071 1,964,565.875 2075553 2,155,378.750 2,147,925.500 2,322,932.750
EHD1 330,271.500 502,913.531 373,754.594 473,851.781 388,169.375 390,831.156 357,288.906 495,081.156 473,447.281 425,074.531 ... 463,250.406 477,691.688 313,132.062 339,193.469 367,538.062 486,495.375 404,351.750 483,117.969 458,465.344 438,883.125
SLCO2B1 199,307.484 143,733.969 255,973.359 190,101.312 203,529.500 194,279.516 139,268.547 164,807.203 123,690.359 153,669.688 ... 195,860.953 201,834.906 255,410.828 125,926.484 137,087.328 237,537.891 96,533.406 204,122.422 225,113.875 192,365.484
CLN5 Filtered Filtered Filtered Filtered Filtered Filtered Filtered Filtered Filtered Filtered ... Filtered 13,107.377 Filtered Filtered Filtered 15,076.928 Filtered 3,531.896 18,174.498 9,617.133
SERPINA1 Filtered Filtered Filtered 47,114.207 Filtered Filtered 379,342.312 244,059.375 256,977.031 Filtered ... Filtered Filtered Filtered 318,682.156 Filtered Filtered 237,668.641 Filtered Filtered 70287
CES2 1,092,631.875 1,454,136.750 1,225,616.875 2,870,181.250 3,865,053.750 831,837.062 1,946,783.750 1,113,789.500 1,755,792.750 1,505,847.875 ... 1,389,096.125 1,309,552.250 1,105,335.500 1,285,968.500 1,030,523.375 1,013,189.812 1,108,798.250 957,342.688 3470175 1,951,160.875
PSMA2 199,273.578 242,705.812 303,864.562 234,321.031 283,873.594 286,188.188 222,368.141 224,623.953 275,215.562 222,035.688 ... 138,271.688 243,618.500 249,852.672 157,501.797 280,293.406 300,148.250 140,412.578 252,308.062 247,140.125 223,689.422
SLC22A18 160,265.141 193,130.844 250,096.406 259,525.891 302,218.531 146,426.094 348,458.688 117,763.070 105,282.703 289,042.188 ... 107,170.117 237,222.844 204,315.656 205,476.781 176,281.344 206,234.797 211,143.297 128,544.938 162,557.141 145,409.281
BAT3;BAG6;BAG6;BAG6;BAG6 60,280.043 36,777.355 38,410.445 54,693.484 60,997.504 56,639.656 69,406.523 55,038.301 61,447.113 68,598.828 ... 67,337.188 60,179.352 80,094.750 78,789.789 50,715.750 48,010.484 76,085.156 70,295.391 65,778.273 64,083.840
ATP11C 32,872.945 27,984.643 38,568.148 33,473.094 26,374.389 36,317.363 29,375.752 28,298.932 57,894.008 41,461.367 ... 25,378.984 36,288.930 11,101.906 29,204.436 31,914.850 60,165.012 15,100.244 45,167.070 32,392.289 25,482.834
DNAH8 332,762.219 733,029.562 218,006.656 946,468.812 621,556.625 803,685.250 575,494.562 8667509 247,582.688 8773114 ... 4,866,890.500 950,279.938 5,874,009.500 5,836,461.500 7385411 599,403.562 5,546,880.500 6,960,662.500 9435897 9134806
DMD 46,570.230 50,822.738 53,851.820 70,027.234 65,548.461 52,198.273 64,340.734 53,450.133 57,344.770 47,208.457 ... 48,659.852 44,670.547 48,664.242 50,891.242 64,973.664 81,802.180 46,152.430 27,892.869 48,634.680 57,633.016
IGLV4-69 64,019.664 28,048.295 63,318.359 41,584.699 Filtered 72,252.844 32,467.543 35,657.891 28,994.424 Filtered ... 99,831.039 69,801.828 95,913.172 Filtered Filtered 34,501.965 71,085.344 70,016.703 35,502.512 51,195.348
IGLV3-9 39,297.406 39,449.070 36,206.992 67,045.750 24,939.586 40,523.039 29,538.615 35,265.719 21,339.572 32,755.973 ... 23,832.484 34,066.516 25,064.008 27,681.111 11,509.729 40,482.312 32,545.160 12,019.879 41,269.809 31,362.844

19 rows × 83 columns

The analytical sample id is used in the columns of the above Proteomics data (MS2 Quantification).

excel_sheets.Design
biological_sample external_id analytical_sample external_id grouping1
subject external_id
S1 S1_Liver PlateS1D8 5
S10 S10_Liver PlateS1H1 6
S11 S11_Liver PlateS1F7 2
S12 S12_Liver PlateS1G3 1
S13 S13_Liver PlateS1C5 6
... ... ... ...
S80 S80_Liver PlateS1B12 6
S81 S81_Liver PlateS1D12 2
S82 S82_Liver PlateS1F12 5
S83 S83_Liver PlateS1H12 6
S9 S9_Liver PlateS1D3 2

83 rows × 3 columns

The Design sheet contains the mapping of the analytical sample id to the biological sample id used in the (artifical) clinical data:

excel_sheets.Clinic
tissue disease biological_sample external_id biological_sample quantity biological_sample source biological_sample conservation_conditions biological_sample storage analytical_sample external_id analytical_sample quantity analytical_sample source analytical_sample conservation_conditions analytical_sample storage timepoint Age (102518004) Body mass index (60621009) Biological sex (property) (qualifier value) (734000001)
subject external_id
S1 liver liver disease S1_Liver 10 human snap-frozen freezer -80°C PlateS1D8 0.500 Peptides freezer -80°C freezer -80°C baseline 43 31 Male
S2 liver liver disease S2_Liver 10 human snap-frozen freezer -80°C PlateS1G4 0.500 Peptides freezer -80°C freezer -80°C baseline 56 27 Female
S3 liver liver disease S3_Liver 10 human snap-frozen freezer -80°C PlateS1D7 0.500 Peptides freezer -80°C freezer -80°C baseline 31 30 Male
S4 liver liver disease S4_Liver 10 human snap-frozen freezer -80°C PlateS1E6 0.500 Peptides freezer -80°C freezer -80°C baseline 44 36 Male
S5 liver liver disease S5_Liver 10 human snap-frozen freezer -80°C PlateS1F6 0.500 Peptides freezer -80°C freezer -80°C baseline 65 36 Male
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
S79 liver liver disease S79_Liver 10 human snap-frozen freezer -80°C PlateS1E1 0.500 Peptides freezer -80°C freezer -80°C baseline 48 39 Female
S80 liver liver disease S80_Liver 10 human snap-frozen freezer -80°C PlateS1B12 0.500 Peptides freezer -80°C freezer -80°C baseline 69 30 Male
S81 liver liver disease S81_Liver 10 human snap-frozen freezer -80°C PlateS1D12 0.500 Peptides freezer -80°C freezer -80°C baseline 41 37 Male
S82 liver liver disease S82_Liver 10 human snap-frozen freezer -80°C PlateS1F12 0.500 Peptides freezer -80°C freezer -80°C baseline 58 31 Female
S83 liver liver disease S83_Liver 10 human snap-frozen freezer -80°C PlateS1H12 0.500 Peptides freezer -80°C freezer -80°C baseline 46 32 Male

83 rows × 16 columns