Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 913
Duplicate rows (%): 9.1%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
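
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and one-decimal rounding (the variable names are illustrative and do not come from the report):

```python
# Values copied from the dataset statistics above.
n_observations = 10_000
n_duplicate_rows = 913
total_size_kib = 419.9

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{duplicate_pct:.1f}%")      # 9.1%
print(f"{avg_record_bytes:.1f} B")  # 43.0 B
```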

Variable types

Numeric: 3
Text: 1

Dataset

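Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: 금천구 (Geumcheon-gu)
URL: https://data.seoul.go.kr/dataList/OA-10116/S/1/datasetView.do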

Alerts

Dataset has 913 (9.1%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-05-11 15:55:36.061014
Analysis finished: 2024-05-11 15:55:41.498788
Duration: 5.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
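
Because 처리일자 stores dates as YYYYMMDD integers, the range and IQR above are differences between those integers, not elapsed days. A small sketch of the distinction, assuming every value is a valid calendar date in that encoding (the helper name `parse_yyyymmdd` is hypothetical):

```python
from datetime import date


def parse_yyyymmdd(value: int) -> date:
    """Decode an integer such as 20240411 into a calendar date."""
    year, rest = divmod(value, 10_000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


# Differences on the raw integers, as reported in the quantile statistics.
numeric_range = 20240510 - 20240411  # 99
numeric_iqr = 20240503 - 20240418    # 85

# The same differences measured in calendar days.
calendar_span_days = (parse_yyyymmdd(20240510) - parse_yyyymmdd(20240411)).days  # 29
iqr_days = (parse_yyyymmdd(20240503) - parse_yyyymmdd(20240418)).days            # 15

print(numeric_range, calendar_span_days)
print(numeric_iqr, iqr_days)
```

A span of 29 days between the minimum and maximum covers 30 calendar days, which agrees with the 30 distinct values reported for this variable.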

Descriptive statistics

Standard deviation: 40.379267
Coefficient of variation (CV): 1.9949789 × 10⁻⁶
Kurtosis: -1.4710384
Mean: 20240449
Median Absolute Deviation (MAD): 10
Skewness: 0.67987157
Sum: 2.0240449 × 10¹¹
Variance: 1630.4852
Monotonicity: Not monotonic
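
The coefficient of variation and the variance can be cross-checked against the reported mean and standard deviation. This is a sketch that assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared) and uses the rounded values printed above:

```python
import math

# Rounded values as printed in the descriptive statistics.
mean = 20240449
std = 40.379267

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance as the square of the standard deviation

# Compare against the reported figures, allowing for rounding in the inputs.
print(math.isclose(cv, 1.9949789e-6, rel_tol=1e-6))
print(math.isclose(variance, 1630.4852, rel_tol=1e-6))
```

The very small CV reflects the large magnitude of the YYYYMMDD integers relative to their spread, not low variability in the dates themselves.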
Histogram with fixed size bins (bins=30)
Value              Count  Frequency (%)
20240501           363    3.6%
20240417           362    3.6%
20240422           361    3.6%
20240415           356    3.6%
20240412           353    3.5%
20240425           352    3.5%
20240418           352    3.5%
20240506           350    3.5%
20240510           344    3.4%
20240416           344    3.4%
Other values (20)  6463   64.6%

Value     Count  Frequency (%)
20240411  338    3.4%
20240412  353    3.5%
20240413  335    3.4%
20240414  327    3.3%
20240415  356    3.6%
20240416  344    3.4%
20240417  362    3.6%
20240418  352    3.5%
20240419  334    3.3%
20240420  287    2.9%

Value     Count  Frequency (%)
20240510  344    3.4%
20240509  330    3.3%
20240508  335    3.4%
20240507  322    3.2%
20240506  350    3.5%
20240505  313    3.1%
20240504  317    3.2%
20240503  321    3.2%
20240502  317    3.2%
20240501  363    3.6%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12917.289
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5253.3627
Coefficient of variation (CV): 0.40669235
Kurtosis: 1.3464833
Mean: 12917.289
Median Absolute Deviation (MAD): 105
Skewness: 1.650827
Sum: 1.2917289 × 10⁸
Variance: 27597820
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
10101              257    2.6%
10113              254    2.5%
20301              249    2.5%
10105              246    2.5%
10103              246    2.5%
10117              244    2.4%
10402              242    2.4%
10107              239    2.4%
10410              239    2.4%
10208              239    2.4%
Other values (49)  7545   75.4%

Value  Count  Frequency (%)
10101  257    2.6%
10102  209    2.1%
10103  246    2.5%
10104  200    2.0%
10105  246    2.5%
10106  224    2.2%
10107  239    2.4%
10108  119    1.2%
10109  16     0.2%
10110  200    2.0%

Value  Count  Frequency (%)
30111  131    1.3%
30110  67     0.7%
24205  238    2.4%
24201  209    2.1%
24113  200    2.0%
24101  90     0.9%
20399  149    1.5%
20301  249    2.5%
20199  93     0.9%
20107  185    1.8%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2495
Min length: 2

Characters and Unicode

Total characters: 42495
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
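
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count characters by general category. The helper name `category_counts` is hypothetical, and the sample string is one of the business type names that appears in this dataset:

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Category codes: Lo = Other Letter, Po = Other Punctuation,
# Ps = Open Punctuation, Pe = Close Punctuation, Zs = Space Separator.
sample = "정종,대포집(선술집)"
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the character statistics for this variable.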

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 여관업 (inn business)
2nd row: 편의점 (convenience store)
3rd row: 한식 (Korean food)
4th row: 철도역구내 (railway station premises)
5th row: 룸살롱 (room salon)
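Value                                  Count  Frequency (%)
기타 (other)                           864    8.4%
패스트푸드 (fast food)                 447    4.4%
전통찻집 (traditional teahouse)        283    2.8%
한식 (Korean food)                     257    2.5%
통닭(치킨 (fried chicken)              254    2.5%
관광호텔 (tourist hotel)               253    2.5%
일반이용업 (general barbershop business)  249    2.4%
분식 (snack food)                      246    2.4%
경양식 (light Western-style food)      246    2.4%
까페 (cafe)                            244    2.4%
Other values (44)                      6899   67.4%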

Most occurring characters

Value               Count  Frequency (%)
(missing)           1824   4.3%
(missing)           1614   3.8%
(                   1246   2.9%
)                   1246   2.9%
(missing)           1045   2.5%
(missing)           1008   2.4%
(missing)           864    2.0%
(missing)           864    2.0%
(missing)           845    2.0%
(missing)           767    1.8%
Other values (118)  31172  73.4%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39522  93.0%
Open Punctuation   1246   2.9%
Close Punctuation  1246   2.9%
Space Separator    242    0.6%
Other Punctuation  239    0.6%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(missing)           1824   4.6%
(missing)           1614   4.1%
(missing)           1045   2.6%
(missing)           1008   2.6%
(missing)           864    2.2%
(missing)           864    2.2%
(missing)           845    2.1%
(missing)           767    1.9%
(missing)           761    1.9%
(missing)           717    1.8%
Other values (114)  29213  73.9%

Open Punctuation
Value  Count  Frequency (%)
(      1246   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1246   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  242    100.0%

Other Punctuation
Value  Count  Frequency (%)
,      239    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39522  93.0%
Common  2973   7.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(missing)           1824   4.6%
(missing)           1614   4.1%
(missing)           1045   2.6%
(missing)           1008   2.6%
(missing)           864    2.2%
(missing)           864    2.2%
(missing)           845    2.1%
(missing)           767    1.9%
(missing)           761    1.9%
(missing)           717    1.8%
Other values (114)  29213  73.9%

Common
Value    Count  Frequency (%)
(        1246   41.9%
)        1246   41.9%
(space)  242    8.1%
,        239    8.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39522  93.0%
ASCII   2973   7.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(missing)           1824   4.6%
(missing)           1614   4.1%
(missing)           1045   2.6%
(missing)           1008   2.6%
(missing)           864    2.2%
(missing)           864    2.2%
(missing)           845    2.1%
(missing)           767    1.9%
(missing)           761    1.9%
(missing)           717    1.8%
Other values (114)  29213  73.9%

ASCII
Value    Count  Frequency (%)
(        1246   41.9%
)        1246   41.9%
(space)  242    8.1%
,        239    8.0%

건수 (count)
Real number (ℝ)

Distinct: 738
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 164.8604
Minimum: 1
Maximum: 4926
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 25
Q3: 138
95-th percentile: 637
Maximum: 4926
Range: 4925
Interquartile range (IQR): 134

Descriptive statistics

Standard deviation: 415.47255
Coefficient of variation (CV): 2.5201477
Kurtosis: 38.875588
Mean: 164.8604
Median Absolute Deviation (MAD): 24
Skewness: 5.4546784
Sum: 1648604
Variance: 172617.44
Monotonicity: Not monotonic
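
For a heavily right-skewed variable like this one, the median absolute deviation is a more robust measure of spread than the standard deviation. The sketch below assumes the usual definition (the median of absolute deviations from the median). It uses only the ten 건수 values from the first table in the Sample section, so its result differs from the full-dataset MAD of 24. The function name is hypothetical.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)


# 건수 values from the first ten rows shown in the Sample section.
sample_counts = [17, 221, 2626, 1, 35, 24, 1533, 99, 527, 3]

print(median(sample_counts))                     # 67.0
print(median_absolute_deviation(sample_counts))  # 65.0
```

The two largest values (1533 and 2626) have almost no effect on the MAD, whereas they would dominate a standard deviation computed on the same ten numbers.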
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1105   11.1%
2                   727    7.3%
4                   450    4.5%
3                   398    4.0%
5                   351    3.5%
7                   182    1.8%
8                   181    1.8%
6                   176    1.8%
11                  153    1.5%
9                   148    1.5%
Other values (728)  6129   61.3%

Value  Count  Frequency (%)
1      1105   11.1%
2      727    7.3%
3      398    4.0%
4      450    4.5%
5      351    3.5%
6      176    1.8%
7      182    1.8%
8      181    1.8%
9      148    1.5%
10     75     0.8%

Value  Count  Frequency (%)
4926   6      0.1%
4925   1      < 0.1%
4924   1      < 0.1%
4923   2      < 0.1%
4922   2      < 0.1%
4915   1      < 0.1%
3228   1      < 0.1%
3225   1      < 0.1%
3221   4      < 0.1%
3219   3      < 0.1%

Interactions

[Figures: nine interaction plots; images not included]

Correlations

Variable                처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.000                   0.000                 0.000
청소년유해업소업종코드  0.000     1.000                   0.999                 0.132
청소년유해업소업종명    0.000     0.999                   1.000                 0.741
건수                    0.000     0.132                   0.741                 1.000

Variable                처리일자  청소년유해업소업종코드  건수
처리일자                1.000     0.002                   -0.009
청소년유해업소업종코드  0.002     1.000                   -0.273
건수                    -0.009    -0.273                  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

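Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
2583   20240413  20105  여관업 (inn business)  17
28236  20240506  10410  편의점 (convenience store)  221
4287   20240414  10101  한식 (Korean food)  2626
26013  20240504  10408  철도역구내 (railway station premises)  1
23413  20240502  10208  룸살롱 (room salon)  35
4816   20240415  10409  백화점 (department store)  24
12783  20240422  10101  한식 (Korean food)  1533
20847  20240429  10103  경양식 (light Western-style food)  99
13773  20240423  10103  경양식 (light Western-style food)  527
6199   20240416  10408  철도역구내 (railway station premises)  3

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
6447   20240416  20105  여관업 (inn business)  124
918    20240411  24205  노래연습장업 (singing practice room business)  322
12593  20240422  24201  비디오물감상실업 (video viewing room business)  1
5390   20240415  24113  일반게임제공업 (general game provision business)  5
23575  20240502  10114  복어취급 (pufferfish handling)  11
3442   20240414  10107  정종,대포집(선술집) (rice wine and daepo bar (tavern))  130
20326  20240429  24101  게임제공업 (game provision business)  43
19067  20240428  10119  탕류 (soups and stews)  5
18863  20240428  10411  패스트푸드 (fast food)  50
7820   20240418  20107  여인숙업 (lodging house business)  7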

Duplicate rows

Most frequently occurring

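Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
50     20240412  10201  카바레 (cabaret)  2  6
71     20240412  24201  비디오물감상실업 (video viewing room business)  1  6
88     20240413  10413  전통찻집 (traditional teahouse)  1  6
343    20240422  10202  고고(디스코)클럽 (go-go (disco) club)  1  6
631    20240501  24201  비디오물감상실업 (video viewing room business)  1  6
733    20240505  10413  전통찻집 (traditional teahouse)  1  6
24     20240411  10408  철도역구내 (railway station premises)  1  5
164    20240416  10202  고고(디스코)클럽 (go-go (disco) club)  1  5
209    20240417  20399  이용업 기타 (barbershop business, other)  1  5
268    20240419  24201  비디오물감상실업 (video viewing room business)  1  5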