21 September 2023 - #DAA
- ## Types of Data
- Numerical
  - Discrete Data ({1, 2, 3, ...})
  - Continuous Data ($[1, +\infty)$)
- *Categorical* (binary, languages, ...)
- **Ordinal** (ratings of 1 to 5)

\* not scaled
\*\* scaled

>[!example]- What type of data represents the quantity of diesel?
>Numerical: continuous data

>[!example]- What type of data represents nationality?
>Categorical

>[!example]- What type of data represents age?
>Numerical: discrete
- ## Mean, Median & Mode
- >[!hint]+
>The **mean** is the arithmetic average of a data set: the sum of all values divided by the number of values.
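
As a minimal sketch of the three measures in this section's title: the median is the middle value of the sorted data and the mode is the most frequent value. The `ages` list below is a made-up sample, not data from the class, and the standard-library `statistics` module is used only for illustration.

```python
import statistics

# Hypothetical sample: ages of seven people (made up for illustration).
ages = [21, 22, 22, 23, 25, 30, 46]

mean = sum(ages) / len(ages)      # sum of the values divided by their count
median = statistics.median(ages)  # middle value of the sorted data
mode = statistics.mode(ages)      # most frequent value

print(mean, median, mode)  # prints: 27.0 23 22
```

In this sample the single large value (46) pulls the mean above the median, which is one reason to look at both.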
- ## Standard Deviation & Variance
- ## Probability Density functions
- ## Percentiles
There are 3 important percentiles:
- 50% - median (2nd quartile)
- 25% - 1st quartile
- 75% - 3rd quartile
>[!note]+
>These 3 percentiles define the box of a box plot, a graph that makes outliers easy to detect and present.
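
The sketch below shows how these three values can be computed and used to flag outliers. The sample data is made up, and the 1.5 × IQR fences are an assumption: this is the common box plot convention, but the notes do not say which rule was used in class. Different quartile methods can also give slightly different values.

```python
import statistics

# Hypothetical sample (made up for illustration); 46 is unusually large.
ages = [21, 22, 22, 23, 25, 30, 46]

# n=4 cuts the data into quarters and returns the three cut points.
# method="inclusive" treats the sample as the whole population.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in ages if x < lower or x > upper]

print(q1, q2, q3)  # prints: 22.0 23.0 27.5
print(outliers)    # prints: [46]
```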
- ## Covariance & Correlation
>[!hint]+
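>**Covariance** indicates the direction of the linear relationship between two variables, while **correlation** measures both its direction and its strength on a fixed scale from -1 to 1.

The magnitude of a covariance depends on the units of the variables, which makes it hard to interpret, so correlation is used instead.
As a rule of thumb, correlations above 0.5 in absolute value are considerable.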
>[!caution] Correlation does not mean causation!
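
To illustrate why covariance is hard to interpret, the sketch below computes both measures by hand for the same heights expressed in metres and in centimetres. All data and function names are hypothetical. It assumes the sample covariance (dividing by n - 1) and the Pearson correlation coefficient, which the notes do not specify.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equally long sequences (divides by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance rescaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: the same four heights in two units, plus weights.
heights_m = [1.60, 1.70, 1.80, 1.90]
heights_cm = [160, 170, 180, 190]
weights_kg = [55, 68, 72, 90]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)

print(cov_m, cov_cm)    # covariance grows 100x when metres become centimetres
print(corr_m, corr_cm)  # correlation is the same in both units
```

Both covariances are positive, so they agree on the direction, but only the correlation gives a unit-free measure of strength.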
- ## Practical session of the class: miniconda
IDEs: PyCharm, VS Code, (Jupyter - not recommended)
After installing Miniconda, run the following commands:
- `conda create --name daaEnv python=3.10`
- `conda activate daaEnv`
- `python --version`
- `conda install pandas`
- `conda install xlrd`
- `conda install xlwt`
- `conda install matplotlib`
- `conda install seaborn`
- `conda install scikit-learn`
- `conda install jupyterlab`
- `conda list`
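
As an optional check alongside `conda list`, the sketch below can be run with the Python interpreter of the activated `daaEnv` environment to report which of the packages are visible to it. The helper name `installed_versions` is hypothetical, and the check assumes each package ships standard metadata that `importlib.metadata` can read.

```python
from importlib import metadata

# Package names taken from the conda install commands above.
PACKAGES = ["pandas", "xlrd", "xlwt", "matplotlib", "seaborn", "scikit-learn", "jupyterlab"]

def installed_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```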
- ## Resource links
- https://en.wikipedia.org/wiki/Average
- https://en.wikipedia.org/wiki/Median
- https://en.wikipedia.org/wiki/Mode_(statistics)
- https://en.wikipedia.org/wiki/Standard_deviation
- https://en.wikipedia.org/wiki/Variance
- https://en.wikipedia.org/wiki/Probability_density_function
- https://en.wikipedia.org/wiki/Percentile
- https://en.wikipedia.org/wiki/Covariance
- https://en.wikipedia.org/wiki/Correlation