Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 20000
Missing cells (%): 28.6%
Duplicate rows: 43
Duplicate rows (%): 0.4%
Total size in memory: 615.5 KiB
Average record size in memory: 63.0 B
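
The percentages and the average record size follow from the counts above. The sketch below redoes that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. The variable names are illustrative and are not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 10_000
missing_cells = 20_000
duplicate_rows = 43
total_size_kib = 615.5

# Assumption: percentages are relative to all cells and all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The 20000 missing cells correspond to the two columns that are missing in all 10000 rows.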

Variable types

Unsupported: 3
Categorical: 2
Numeric: 1
DateTime: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15966/S/1/datasetView.do

Alerts

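모델명 (model name) has constant value "" [Constant]
Dataset has 43 (0.4%) duplicate rows [Duplicates]
고유번호 (unique ID) is highly imbalanced (59.5%) [Imbalance]
악취저감장치 연속OFF시간 (odor reduction device continuous OFF time) has 10000 (100.0%) missing values [Missing]
등록일시 (registration date and time) has 10000 (100.0%) missing values [Missing]
기관 명 (organization name) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
악취저감장치 연속OFF시간 (odor reduction device continuous OFF time) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
등록일시 (registration date and time) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2) (IoT device status value; off: 0, on: 1, unknown: 2) has 8324 (83.2%) zeros [Zeros]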

Reproduction

Analysis started: 2024-05-17 22:14:28.680958
Analysis finished: 2024-05-17 22:14:30.030291
Duration: 1.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
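
The reported duration is the difference between the two timestamps above. A minimal check using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-17 22:14:28.680958", FMT)
finished = datetime.strptime("2024-05-17 22:14:30.030291", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```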

Variables

기관 명 (organization name)
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 97.9 KiB

모델명 (model name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 97.9 KiB

Value | Count
1 | 10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 10000 | 100.0%

고유번호 (unique ID)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 97.9 KiB

Value | Count
1 | 9193
0 | 807

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 9193 | 91.9%
0 | 807 | 8.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 9193 | 91.9%
0 | 807 | 8.1%
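
The 59.5% imbalance reported for 고유번호 can be reproduced from its two class counts (9193 and 807). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. The function name `imbalance_score` is illustrative and is not taken from ydata-profiling.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (0 = balanced, 1 = one class dominates fully)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Convention chosen for this sketch: a single class gets score 0.
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Divide by the maximum possible entropy for this many classes.
    return 1 - entropy / math.log2(n_classes)


# Class counts for 고유번호 from the Common Values table.
score = imbalance_score([9193, 807])
print(f"Imbalance: {score:.1%}")
```

Under this assumption a perfectly balanced variable scores 0, and the score approaches 1 as a single value dominates.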

IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2) (IoT device status value; off: 0, on: 1, unknown: 2)
Numeric

Distinct: 1209
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 485.8133
Minimum: 0
Maximum: 22188
Zeros: 8324
Zeros (%): 83.2%
Negative: 0
Negative (%): 0.0%
Memory size: 107.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 3107.4
Maximum: 22188
Range: 22188
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1911.3612
Coefficient of variation (CV): 3.9343533
Kurtosis: 36.942002
Mean: 485.8133
Median Absolute Deviation (MAD): 0
Skewness: 5.5580527
Sum: 4858133
Variance: 3653301.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
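
Several of the statistics above are derived from one another. The sketch below recomputes them from the rounded figures in the report, so small rounding differences are expected. The variable names are illustrative.

```python
# Figures copied from the report (already rounded there).
n = 10_000
mean = 485.8133
std = 1911.3612
minimum, maximum = 0, 22188
q1, q3 = 0, 0
zeros = 8324

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
zeros_pct = 100 * zeros / n      # share of zero values

print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.1f}")
print(f"Sum: {total:.0f}")
print(f"Range: {value_range}, IQR: {iqr}, Zeros: {zeros_pct:.1f}%")
```

An IQR and MAD of 0 next to a standard deviation near 1911 reflect that 83.2% of the values are zero while a small tail reaches 22188.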

Common values

Value | Count | Frequency (%)
0 | 8324 | 83.2%
2 | 8 | 0.1%
13 | 8 | 0.1%
1074 | 7 | 0.1%
1108 | 6 | 0.1%
1093 | 6 | 0.1%
1096 | 6 | 0.1%
61 | 6 | 0.1%
1136 | 6 | 0.1%
10 | 6 | 0.1%
Other values (1199) | 1617 | 16.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8324 | 83.2%
2 | 8 | 0.1%
3 | 3 | < 0.1%
4 | 4 | < 0.1%
5 | 1 | < 0.1%
6 | 2 | < 0.1%
7 | 4 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%
10 | 6 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
22188 | 1 | < 0.1%
22185 | 1 | < 0.1%
22179 | 1 | < 0.1%
22169 | 1 | < 0.1%
22167 | 1 | < 0.1%
22153 | 1 | < 0.1%
22148 | 1 | < 0.1%
22144 | 1 | < 0.1%
22134 | 1 | < 0.1%
16907 | 1 | < 0.1%

악취저감장치상태(꺼짐:0, 켜짐:1, 알수없음:2) (odor reduction device status; off: 0, on: 1, unknown: 2)
DateTime

Distinct: 9908
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 97.9 KiB
Minimum: 2024-01-22 00:20:03
Maximum: 2024-01-25 14:39:25
Histogram with fixed size bins (bins=50)

악취저감장치 연속OFF시간 (odor reduction device continuous OFF time)
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 10000
Missing (%): 100.0%
Memory size: 107.7 KiB

등록일시 (registration date and time)
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 10000
Missing (%): 100.0%
Memory size: 107.7 KiB

Interactions


Correlations

Variable | 고유번호 | IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2)
고유번호 | 1.000 | 0.470
IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2) | 0.470 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

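기관 명 (organization name) | 모델명 (model name) | 고유번호 (unique ID) | IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2) (IoT device status value; off: 0, on: 1, unknown: 2) | 악취저감장치상태(꺼짐:0, 켜짐:1, 알수없음:2) (odor reduction device status; off: 0, on: 1, unknown: 2) | 악취저감장치 연속OFF시간 (odor reduction device continuous OFF time) | 등록일시 (registration date and time)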
서울시 | NTS1000000002103110 | 2024-01-22 12:32:03 | <NA> | <NA>
NTS1000000000629110 | 2024-01-24 08:05:03 | <NA> | <NA>
NTS1000000000404110 | 2024-01-23 08:41:40 | <NA> | <NA>
NTS10000000000311111600 | 2024-01-22 07:35:06 | <NA> | <NA>
NTS1000000000080110 | 2024-01-22 02:41:24 | <NA> | <NA>
NTS100383110 | 2024-01-25 07:48:19 | <NA> | <NA>
NTS1000000000346110 | 2024-01-23 10:15:58 | <NA> | <NA>
NTS100926110 | 2024-01-24 19:52:09 | <NA> | <NA>
NTS1000000000867110 | 2024-01-23 02:29:43 | <NA> | <NA>
NTS10000000002311124 | 2024-01-23 23:25:45 | <NA> | <NA>
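
기관 명 (organization name) | 모델명 (model name) | 고유번호 (unique ID) | IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2) (IoT device status value; off: 0, on: 1, unknown: 2) | 악취저감장치상태(꺼짐:0, 켜짐:1, 알수없음:2) (odor reduction device status; off: 0, on: 1, unknown: 2) | 악취저감장치 연속OFF시간 (odor reduction device continuous OFF time) | 등록일시 (registration date and time)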
서울시 | NTS1000000000256110 | 2024-01-22 06:51:34 | <NA> | <NA>
NTS1000000000974110 | 2024-01-23 09:33:06 | <NA> | <NA>
NTS1000000000703110 | 2024-01-23 04:47:38 | <NA> | <NA>
NTS1000000001089110 | 2024-01-23 14:15:52 | <NA> | <NA>
NTS100638110 | 2024-01-24 11:12:56 | <NA> | <NA>
NTS1000000000271110 | 2024-01-22 06:51:48 | <NA> | <NA>
NTS100206311598 | 2024-01-24 21:29:08 | <NA> | <NA>
NTS100139110 | 2024-01-25 02:42:44 | <NA> | <NA>
NTS100717110 | 2024-01-25 02:50:02 | <NA> | <NA>
NTS100146110 | 2024-01-25 04:16:28 | <NA> | <NA>

Duplicate rows

Most frequently occurring

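Index | 모델명 (model name) | 고유번호 (unique ID) | IoT기기상태값(꺼짐:0, 켜짐:1, 알수없음:2) | 악취저감장치상태(꺼짐:0, 켜짐:1, 알수없음:2) | # duplicates
23 | 1 | 1 | 0 | 2024-01-23 02:32:58 | 4
24 | 1 | 1 | 0 | 2024-01-23 02:34:39 | 4
12 | 1 | 1 | 0 | 2024-01-23 02:20:39 | 3
13 | 1 | 1 | 0 | 2024-01-23 02:21:08 | 3
14 | 1 | 1 | 0 | 2024-01-23 02:22:07 | 3
16 | 1 | 1 | 0 | 2024-01-23 02:23:52 | 3
21 | 1 | 1 | 0 | 2024-01-23 02:31:24 | 3
22 | 1 | 1 | 0 | 2024-01-23 02:32:14 | 3
29 | 1 | 1 | 0 | 2024-01-24 02:34:41 | 3
32 | 1 | 1 | 0 | 2024-01-24 02:37:47 | 3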
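
A table of this kind can be built by grouping identical rows and keeping the groups that occur more than once. The sketch below does this with the standard library, assuming rows are compared on the four supported columns only. The toy rows are made up for illustration and are not read from the source file, and this is not necessarily how ydata-profiling derives its own count.

```python
from collections import Counter

# Hypothetical rows: (모델명, 고유번호, status value, timestamp).
# The repetition counts below are illustrative only.
rows = (
    [("1", "1", 0, "2024-01-23 02:32:58")] * 4
    + [("1", "1", 0, "2024-01-23 02:20:39")] * 3
    + [("1", "0", 0, "2024-01-22 00:20:03")]  # occurs once, so not a duplicate
)

# Count how often each distinct row occurs.
counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicate_groups = sorted(
    ((row, n) for row, n in counts.items() if n > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, n in duplicate_groups:
    print(*row, n, sep=" | ")
```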