Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 4 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 256.0 B |
Average record size in memory | 64.0 B |
Variable types
Categorical | 4 |
---|
humedad is highly correlated with temperatura and 1 other fields | High correlation |
temperatura is highly correlated with humedad and 2 other fields | High correlation |
df_index is highly correlated with humedad and 2 other fields | High correlation |
presion is highly correlated with temperatura and 1 other fields | High correlation |
df_index is uniformly distributed | Uniform |
temperatura is uniformly distributed | Uniform |
df_index has unique values | Unique |
temperatura has unique values | Unique |
Reproduction
Analysis started | 2021-04-25 17:00:02.738456 |
---|---|
Analysis finished | 2021-04-25 17:00:06.283925 |
Duration | 3.55 seconds |
Software version | pandas-profiling v2.11.0 |
Download configuration | config.yaml |
df_index
Distinct | 4 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 160.0 B |
Length
Max length | 3 |
---|---|
Median length | 2 |
Mean length | 2.25 |
Min length | 2 |
Characters and Unicode
Total characters | 9 |
---|---|
Distinct characters | 6 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
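As a rough illustration of the character statistics in this section, the sketch below recomputes the character totals and the general-category counts for the four values shown in this variable's Sample table. It uses only Python's standard `unicodedata` module, which exposes the general category of a code point but not its script or block, so those two breakdowns are not reproduced. That the report computes its figures this way is an assumption.

```python
import unicodedata
from collections import Counter

# Values copied from the Sample table of this variable.
values = ["3h", "6h", "9h", "12h"]

# Count each character across all values.
char_counts = Counter(ch for value in values for ch in value)

# Group the same characters by Unicode general category code
# ("Nd" = Decimal Number, "Ll" = Lowercase Letter).
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

total_characters = sum(char_counts.values())
distinct_characters = len(char_counts)

print(total_characters, distinct_characters, dict(category_counts))
```

The category codes `Nd` and `Ll` correspond to the "Decimal Number" and "Lowercase Letter" rows in the tables of this section.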
Unique
Unique | 4 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 3h |
---|---|
2nd row | 6h |
3rd row | 9h |
4th row | 12h |
Value | Count | Frequency (%) |
12h | 1 | |
6h | 1 | |
3h | 1 | |
9h | 1 |
Histogram of lengths of the category
Value | Count | Frequency (%) |
12h | 1 | |
6h | 1 | |
3h | 1 | |
9h | 1 |
Most occurring characters
Value | Count | Frequency (%) |
h | 4 | |
3 | 1 | 11.1% |
6 | 1 | 11.1% |
9 | 1 | 11.1% |
1 | 1 | 11.1% |
2 | 1 | 11.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 5 | |
Lowercase Letter | 4 |
Most frequent character per category
Value | Count | Frequency (%) |
3 | 1 | |
6 | 1 | |
9 | 1 | |
1 | 1 | |
2 | 1 |
Value | Count | Frequency (%) |
h | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 5 | |
Latin | 4 |
Most frequent character per script
Value | Count | Frequency (%) |
3 | 1 | |
6 | 1 | |
9 | 1 | |
1 | 1 | |
2 | 1 |
Value | Count | Frequency (%) |
h | 4 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 9 |
Most frequent character per block
Value | Count | Frequency (%) |
h | 4 | |
3 | 1 | 11.1% |
6 | 1 | 11.1% |
9 | 1 | 11.1% |
1 | 1 | 11.1% |
2 | 1 | 11.1% |
humedad
Distinct | 3 |
---|---|
Distinct (%) | 75.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 160.0 B |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Characters and Unicode
Total characters | 8 |
---|---|
Distinct characters | 4 |
Distinct categories | 1 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 50.0% |
Sample
1st row | 65 |
---|---|
2nd row | 63 |
3rd row | 65 |
4th row | 67 |
Value | Count | Frequency (%) |
65 | 2 | |
67 | 1 | |
63 | 1 |
Histogram of lengths of the category
Value | Count | Frequency (%) |
65 | 2 | |
63 | 1 | |
67 | 1 |
Most occurring characters
Value | Count | Frequency (%) |
6 | 4 | |
5 | 2 | |
3 | 1 | 12.5% |
7 | 1 | 12.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 8 |
Most frequent character per category
Value | Count | Frequency (%) |
6 | 4 | |
5 | 2 | |
3 | 1 | 12.5% |
7 | 1 | 12.5% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 8 |
Most frequent character per script
Value | Count | Frequency (%) |
6 | 4 | |
5 | 2 | |
3 | 1 | 12.5% |
7 | 1 | 12.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 8 |
Most frequent character per block
Value | Count | Frequency (%) |
6 | 4 | |
5 | 2 | |
3 | 1 | 12.5% |
7 | 1 | 12.5% |
temperatura
Distinct | 4 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 160.0 B |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Characters and Unicode
Total characters | 16 |
---|---|
Distinct characters | 7 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 4 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 35.5 |
---|---|
2nd row | 36.7 |
3rd row | 22.3 |
4th row | 39.7 |
Value | Count | Frequency (%) |
35.5 | 1 | |
39.7 | 1 | |
22.3 | 1 | |
36.7 | 1 |
Histogram of lengths of the category
Value | Count | Frequency (%) |
36.7 | 1 | |
22.3 | 1 | |
39.7 | 1 | |
35.5 | 1 |
Most occurring characters
Value | Count | Frequency (%) |
3 | 4 | |
. | 4 | |
5 | 2 | |
7 | 2 | |
2 | 2 | |
6 | 1 | 6.2% |
9 | 1 | 6.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 12 | |
Other Punctuation | 4 | 25.0% |
Most frequent character per category
Value | Count | Frequency (%) |
3 | 4 | |
5 | 2 | |
7 | 2 | |
2 | 2 | |
6 | 1 | 8.3% |
9 | 1 | 8.3% |
Value | Count | Frequency (%) |
. | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 16 |
Most frequent character per script
Value | Count | Frequency (%) |
3 | 4 | |
. | 4 | |
5 | 2 | |
7 | 2 | |
2 | 2 | |
6 | 1 | 6.2% |
9 | 1 | 6.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 16 |
Most frequent character per block
Value | Count | Frequency (%) |
3 | 4 | |
. | 4 | |
5 | 2 | |
7 | 2 | |
2 | 2 | |
6 | 1 | 6.2% |
9 | 1 | 6.2% |
presion
Distinct | 3 |
---|---|
Distinct (%) | 75.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 160.0 B |
Length
Max length | 3 |
---|---|
Median length | 3 |
Mean length | 3 |
Min length | 3 |
Characters and Unicode
Total characters | 12 |
---|---|
Distinct characters | 6 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 50.0% |
Sample
1st row | 3.6 |
---|---|
2nd row | 2.4 |
3rd row | 5.5 |
4th row | 5.5 |
Value | Count | Frequency (%) |
5.5 | 2 | |
2.4 | 1 | |
3.6 | 1 |
Histogram of lengths of the category
Value | Count | Frequency (%) |
5.5 | 2 | |
3.6 | 1 | |
2.4 | 1 |
Most occurring characters
Value | Count | Frequency (%) |
. | 4 | |
5 | 4 | |
3 | 1 | 8.3% |
6 | 1 | 8.3% |
2 | 1 | 8.3% |
4 | 1 | 8.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 8 | |
Other Punctuation | 4 |
Most frequent character per category
Value | Count | Frequency (%) |
5 | 4 | |
3 | 1 | 12.5% |
6 | 1 | 12.5% |
2 | 1 | 12.5% |
4 | 1 | 12.5% |
Value | Count | Frequency (%) |
. | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 12 |
Most frequent character per script
Value | Count | Frequency (%) |
. | 4 | |
5 | 4 | |
3 | 1 | 8.3% |
6 | 1 | 8.3% |
2 | 1 | 8.3% |
4 | 1 | 8.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 12 |
Most frequent character per block
Value | Count | Frequency (%) |
. | 4 | |
5 | 4 | |
3 | 1 | 8.3% |
6 | 1 | 8.3% |
2 | 1 | 8.3% |
4 | 1 | 8.3% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
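The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the code the report runs; the function name `pearson_r` and the variable names `col_a` and `col_b` are hypothetical, and the sample values are two of the columns from the First rows table, parsed as numbers.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

# Two columns from the First rows table, parsed as numbers.
col_a = [65, 63, 65, 67]
col_b = [35.5, 36.7, 22.3, 39.7]

r = pearson_r(col_a, col_b)
# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([2 * a + 5 for a in col_a], col_b)
print(r, r_rescaled)
```

The function raises `ZeroDivisionError` when either input is constant, since the standard deviation is then zero and r is undefined.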
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
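A small sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their ranks is an assumption here (it is a common convention), and the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relationship still gives rho = 1.
cubes_rho = spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])
print(cubes_rho)
print(average_ranks([65, 63, 65, 67]))
```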
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
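The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. Library implementations often use tie-corrected variants such as tau-b instead, so values can differ when the data contain ties. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = 0
    discordant = 0
    for i, j in pairs:
        # The sign of the product tells whether both variables move the same way.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Of the three pairs here, two are concordant and one is discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This direct version compares every pair and is therefore quadratic in the number of observations, which is fine for illustration but slow for large data.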
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion visually.
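For the bias-corrected Cramér's V mentioned in the correlation descriptions, a minimal pure-Python sketch of Bergsma's 2013 correction follows. It is an illustration under stated assumptions, not the report's own implementation: the function name `cramers_v_corrected` is hypothetical, and returning NaN when the corrected denominator is not positive is a choice made here for very small samples.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V (Bergsma, 2013) for two categorical sequences."""
    n = len(x)
    if n < 2:
        return float("nan")
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corr - 1, k_corr - 1)
    if denominator <= 0:
        # Assumption: treat the degenerate small-sample case as undefined.
        return float("nan")
    return math.sqrt(phi2_corr / denominator)

perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["u"] * 50 + ["v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"], ["c", "d", "c", "d"])
print(perfect, independent)
```

With only four observations, as in this report, the correction can degenerate: a column with four distinct values makes the corrected denominator zero, so the sketch returns NaN for such pairs.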
First rows
 | df_index | humedad | temperatura | presion |
---|---|---|---|---|
0 | 3h | 65 | 35.5 | 3.6 |
1 | 6h | 63 | 36.7 | 2.4 |
2 | 9h | 65 | 22.3 | 5.5 |
3 | 12h | 67 | 39.7 | 5.5 |
Last rows
 | df_index | humedad | temperatura | presion |
---|---|---|---|---|
0 | 3h | 65 | 35.5 | 3.6 |
1 | 6h | 63 | 36.7 | 2.4 |
2 | 9h | 65 | 22.3 | 5.5 |
3 | 12h | 67 | 39.7 | 5.5 |