Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 891
Duplicate rows (%): 8.9%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
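
The two derived figures above follow from the raw counts. A minimal Python sketch, assuming that 1 KiB is 1024 bytes and that the duplicate percentage is taken over all observations (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics above.
n_observations = 10_000
n_duplicate_rows = 891
total_size_kib = 419.9

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{duplicate_pct:.1f}%")      # 8.9%
print(f"{avg_record_bytes:.1f} B")  # 43.0 B
```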

Variable types

Numeric: 3
Text: 1

Dataset

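Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: 영등포구 (Yeongdeungpo-gu)
URL: https://data.seoul.go.kr/dataList/OA-10424/S/1/datasetView.do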

Alerts

Dataset has 891 (8.9%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-05-11 15:50:04.376157
Analysis finished: 2024-05-11 15:50:09.801400
Duration: 5.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
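
The reported duration is the difference between the two timestamps. A short sketch using only the Python standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-11 15:50:04.376157")
finished = datetime.fromisoformat("2024-05-11 15:50:09.801400")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 5.43 seconds
```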

Variables

처리일자
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240426
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85

Descriptive statistics

Standard deviation: 40.541922
Coefficient of variation (CV): 2.0030149 × 10^-6
Kurtosis: -1.5080172
Mean: 20240449
Median Absolute Deviation (MAD): 11
Skewness: 0.65224617
Sum: 2.0240449 × 10^11
Variance: 1643.6474
Monotonicity: Not monotonic
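
Because 처리일자 appears to store dates as YYYYMMDD integers, the report profiles it as a real number, so the range (99) and IQR (85) count integer steps across the April/May boundary rather than calendar days. The sketch below assumes the YYYYMMDD encoding; `parse_yyyymmdd` is a hypothetical helper, not something the report provides.

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20240411 to a date (assumes YYYYMMDD)."""
    return datetime.strptime(str(value), "%Y%m%d").date()

# Quantiles copied from the report.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

# Differences on the raw integers reproduce the report's figures.
print(maximum - minimum)  # 99
print(q3 - q1)            # 85

# Differences on parsed dates give calendar days instead.
span_days = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
iqr_days = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days
print(span_days)  # 29
print(iqr_days)   # 15
```

A span of 29 days is consistent with the 30 distinct values reported for this variable. Parsing the column to a date type before profiling would make the mean, range, and IQR directly interpretable.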
Histogram with fixed size bins (bins=30)

Common values

Value | Count | Frequency (%)
20240411 | 372 | 3.7%
20240427 | 362 | 3.6%
20240506 | 358 | 3.6%
20240510 | 354 | 3.5%
20240501 | 352 | 3.5%
20240413 | 352 | 3.5%
20240426 | 350 | 3.5%
20240414 | 350 | 3.5%
20240509 | 348 | 3.5%
20240424 | 344 | 3.4%
Other values (20) | 6458 | 64.6%

Minimum 10 values

Value | Count | Frequency (%)
20240411 | 372 | 3.7%
20240412 | 316 | 3.2%
20240413 | 352 | 3.5%
20240414 | 350 | 3.5%
20240415 | 318 | 3.2%
20240416 | 319 | 3.2%
20240417 | 319 | 3.2%
20240418 | 325 | 3.2%
20240419 | 325 | 3.2%
20240420 | 338 | 3.4%

Maximum 10 values

Value | Count | Frequency (%)
20240510 | 354 | 3.5%
20240509 | 348 | 3.5%
20240508 | 341 | 3.4%
20240507 | 332 | 3.3%
20240506 | 358 | 3.6%
20240505 | 323 | 3.2%
20240504 | 310 | 3.1%
20240503 | 337 | 3.4%
20240502 | 315 | 3.1%
20240501 | 352 | 3.5%

청소년유해업소업종코드
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12878.963
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5205.6323
Coefficient of variation (CV): 0.40419655
Kurtosis: 1.4262026
Mean: 12878.963
Median Absolute Deviation (MAD): 107
Skewness: 1.6696609
Sum: 1.2878963 × 10^8
Variance: 27098608
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
10104 | 244 | 2.4%
10411 | 243 | 2.4%
10403 | 237 | 2.4%
10117 | 236 | 2.4%
10112 | 235 | 2.4%
10412 | 235 | 2.4%
10101 | 233 | 2.3%
20301 | 233 | 2.3%
10208 | 232 | 2.3%
10113 | 230 | 2.3%
Other values (49) | 7642 | 76.4%

Minimum 10 values

Value | Count | Frequency (%)
10101 | 233 | 2.3%
10102 | 209 | 2.1%
10103 | 212 | 2.1%
10104 | 244 | 2.4%
10105 | 229 | 2.3%
10106 | 216 | 2.2%
10107 | 215 | 2.1%
10108 | 114 | 1.1%
10109 | 20 | 0.2%
10110 | 185 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
30111 | 128 | 1.3%
30110 | 65 | 0.7%
24205 | 225 | 2.2%
24201 | 189 | 1.9%
24113 | 207 | 2.1%
24101 | 81 | 0.8%
20399 | 145 | 1.5%
20301 | 233 | 2.3%
20199 | 92 | 0.9%
20107 | 192 | 1.9%

청소년유해업소업종명
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2232
Min length: 2

Characters and Unicode

Total characters: 42232
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
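
As an illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using three values that appear in this report's sample rows. It is not necessarily how the profiler computes its figures. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

# Three values taken from the report's sample rows.
names = ["통닭(치킨)", "호프(소주방)", "일반호텔"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

print(category_counts["Lo"])  # 13 Hangul syllables (Other Letter)
print(category_counts["Ps"])  # 2 opening parentheses
print(category_counts["Pe"])  # 2 closing parentheses
```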

Unique

Unique: 0
Unique (%): 0.0%

Sample

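1st row: 일반호텔 (general hotel)
2nd row: 복어취급 (pufferfish handling)
3rd row: 단란주점 (karaoke bar)
4th row: 일반게임제공업 (general game provision business)
5th row: 여관업 (inn business)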
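
Value | Count | Frequency (%)
기타 (other) | 841 | 8.2%
패스트푸드 (fast food) | 460 | 4.5%
전통찻집 (traditional tea house) | 331 | 3.2%
관광호텔 (tourist hotel) | 276 | 2.7%
일식 (Japanese food) | 244 | 2.4%
일반조리판매 (general cooked food sales) | 237 | 2.3%
까페 (café) | 236 | 2.3%
호프(소주방 (beer hall, soju bar) | 235 | 2.3%
커피숍 (coffee shop) | 235 | 2.3%
한식 (Korean food) | 233 | 2.3%
Other values (44) | 6909 | 67.5%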

Most occurring characters

Value | Count | Frequency (%)
 | 1767 | 4.2%
 | 1581 | 3.7%
) | 1213 | 2.9%
( | 1213 | 2.9%
 | 1122 | 2.7%
 | 999 | 2.4%
 | 878 | 2.1%
 | 841 | 2.0%
 | 841 | 2.0%
 | 780 | 1.8%
Other values (118) | 30997 | 73.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39354 | 93.2%
Close Punctuation | 1213 | 2.9%
Open Punctuation | 1213 | 2.9%
Space Separator | 237 | 0.6%
Other Punctuation | 215 | 0.5%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 1767 | 4.5%
 | 1581 | 4.0%
 | 1122 | 2.9%
 | 999 | 2.5%
 | 878 | 2.2%
 | 841 | 2.1%
 | 841 | 2.1%
 | 780 | 2.0%
 | 776 | 2.0%
 | 761 | 1.9%
Other values (114) | 29008 | 73.7%

Close Punctuation

Value | Count | Frequency (%)
) | 1213 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 1213 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 237 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
, | 215 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39354 | 93.2%
Common | 2878 | 6.8%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 1767 | 4.5%
 | 1581 | 4.0%
 | 1122 | 2.9%
 | 999 | 2.5%
 | 878 | 2.2%
 | 841 | 2.1%
 | 841 | 2.1%
 | 780 | 2.0%
 | 776 | 2.0%
 | 761 | 1.9%
Other values (114) | 29008 | 73.7%

Common

Value | Count | Frequency (%)
) | 1213 | 42.1%
( | 1213 | 42.1%
(space) | 237 | 8.2%
, | 215 | 7.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39354 | 93.2%
ASCII | 2878 | 6.8%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
 | 1767 | 4.5%
 | 1581 | 4.0%
 | 1122 | 2.9%
 | 999 | 2.5%
 | 878 | 2.2%
 | 841 | 2.1%
 | 841 | 2.1%
 | 780 | 2.0%
 | 776 | 2.0%
 | 761 | 1.9%
Other values (114) | 29008 | 73.7%

ASCII

Value | Count | Frequency (%)
) | 1213 | 42.1%
( | 1213 | 42.1%
(space) | 237 | 8.2%
, | 215 | 7.5%

건수
Real number (ℝ)

Distinct: 741
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.5272
Minimum: 1
Maximum: 4926
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 137
95-th percentile: 630
Maximum: 4926
Range: 4925
Interquartile range (IQR): 133

Descriptive statistics

Standard deviation: 384.13461
Coefficient of variation (CV): 2.4231464
Kurtosis: 34.168025
Mean: 158.5272
Median Absolute Deviation (MAD): 23
Skewness: 5.143856
Sum: 1585272
Variance: 147559.4
Monotonicity: Not monotonic
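
Several of these statistics are simple functions of one another. The sketch below recomputes them from the reported sum, observation count, standard deviation, and quartiles, assuming the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation); the variable names are illustrative.

```python
# Values copied from the report for 건수.
n_observations = 10_000
total = 1_585_272
std_dev = 384.13461
q1, q3 = 4, 137
minimum, maximum = 1, 4926

mean = total / n_observations  # sum divided by number of observations
cv = std_dev / mean            # coefficient of variation
variance = std_dev ** 2        # squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"{mean:.4f}")      # 158.5272
print(f"{cv:.4f}")        # 2.4231
print(f"{variance:.1f}")  # 147559.4
print(iqr, value_range)   # 133 4925
```

A CV above 2, together with a median (24) far below the mean, reflects the strong right skew also visible in the skewness and kurtosis figures.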
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1072 | 10.7%
2 | 720 | 7.2%
4 | 479 | 4.8%
3 | 385 | 3.9%
5 | 336 | 3.4%
8 | 203 | 2.0%
6 | 191 | 1.9%
7 | 184 | 1.8%
11 | 167 | 1.7%
9 | 155 | 1.6%
Other values (731) | 6108 | 61.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1072 | 10.7%
2 | 720 | 7.2%
3 | 385 | 3.9%
4 | 479 | 4.8%
5 | 336 | 3.4%
6 | 191 | 1.9%
7 | 184 | 1.8%
8 | 203 | 2.0%
9 | 155 | 1.6%
10 | 89 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
4926 | 4 | < 0.1%
4916 | 1 | < 0.1%
3228 | 1 | < 0.1%
3227 | 1 | < 0.1%
3221 | 4 | < 0.1%
3219 | 3 | < 0.1%
3218 | 1 | < 0.1%
3136 | 1 | < 0.1%
3135 | 1 | < 0.1%
3134 | 1 | < 0.1%

Interactions


Correlations

 | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.000
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.138
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.746
건수 | 0.000 | 0.138 | 0.746 | 1.000
 | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | 0.002 | 0.010
청소년유해업소업종코드 | 0.002 | 1.000 | -0.267
건수 | 0.010 | -0.267 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

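First rows

Row index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
31491 | 20240412 | 20102 | 일반호텔 (general hotel) | 1
16313 | 20240426 | 10114 | 복어취급 (pufferfish handling) | 2
541 | 20240510 | 10301 | 단란주점 (karaoke bar) | 76
14380 | 20240427 | 24113 | 일반게임제공업 (general game provision business) | 1
32813 | 20240411 | 20105 | 여관업 (inn business) | 23
1857 | 20240509 | 10113 | 통닭(치킨) (fried chicken) | 121
11548 | 20240430 | 20102 | 일반호텔 (general hotel) | 8
2459 | 20240508 | 10119 | 탕류 (soups and stews) | 3
993 | 20240510 | 10408 | 철도역구내 (railway station premises) | 1
32438 | 20240411 | 10105 | 분식 (snack food) | 864

Last rows

Row index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
24261 | 20240418 | 10111 | 패스트푸드 (fast food) | 9
8181 | 20240503 | 24113 | 일반게임제공업 (general game provision business) | 1
21316 | 20240421 | 10103 | 경양식 (light Western food) | 172
26074 | 20240417 | 10408 | 철도역구내 (railway station premises) | 4
26714 | 20240416 | 10105 | 분식 (snack food) | 862
8452 | 20240503 | 10112 | 호프(소주방) (beer hall, soju bar) | 526
29922 | 20240413 | 10118 | 식육취급 (meat handling) | 23
21380 | 20240421 | 10201 | 카바레 (cabaret) | 2
10960 | 20240430 | 10404 | 관광호텔 (tourist hotel) | 1
209 | 20240510 | 10203 | 관광호텔나이트(디스코) (tourist hotel nightclub, disco) | 2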

Duplicate rows

Most frequently occurring

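Row index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
211 | 20240417 | 10413 | 전통찻집 (traditional tea house) | 1 | 6
36 | 20240411 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 5
80 | 20240413 | 10206 | 스텐드바 (stand bar) | 1 | 5
159 | 20240415 | 10413 | 전통찻집 (traditional tea house) | 1 | 5
318 | 20240421 | 10413 | 전통찻집 (traditional tea house) | 1 | 5
761 | 20240506 | 10413 | 전통찻집 (traditional tea house) | 1 | 5
776 | 20240507 | 10115 | 김밥(도시락) (gimbap, lunch box) | 15 | 5
25 | 20240411 | 10408 | 철도역구내 (railway station premises) | 1 | 4
78 | 20240413 | 10201 | 카바레 (cabaret) | 2 | 4
113 | 20240414 | 10114 | 복어취급 (pufferfish handling) | 2 | 4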
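
A table like the one above can be built by counting how often each full row occurs. The sketch below uses a handful of toy rows shaped like this dataset (illustrative only, not the real data) and the standard library's `Counter`. The report does not state which counting convention produced the figure of 891, so the sketch shows two common ones: the number of distinct rows that repeat, and the number of surplus copies beyond each first occurrence.

```python
from collections import Counter

# Toy rows (처리일자, 업종코드, 업종명, 건수); illustrative only.
rows = [
    (20240417, 10413, "전통찻집", 1),
    (20240417, 10413, "전통찻집", 1),
    (20240417, 10413, "전통찻집", 1),
    (20240507, 10115, "김밥(도시락)", 15),
    (20240507, 10115, "김밥(도시락)", 15),
    (20240510, 10301, "단란주점", 76),
]

# Count occurrences of each complete row and keep those seen more than once.
occurrences = Counter(rows)
duplicate_groups = {row: n for row, n in occurrences.items() if n > 1}

n_groups = len(duplicate_groups)                         # distinct repeated rows
n_extra = sum(n - 1 for n in duplicate_groups.values())  # surplus copies

print(n_groups)  # 2
print(n_extra)   # 3
```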