Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 868
Duplicate rows (%): 8.7%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
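
The two derived figures above (duplicate percentage and average record size) follow from the raw counts. A quick check, assuming 1 KiB = 1024 bytes and using only numbers printed in this report:

```python
# Figures copied from the dataset statistics in this report.
n_rows = 10_000
duplicate_rows = 868
total_size_kib = 419.9

# Share of duplicate rows, in percent.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the report (8.7% and 43.0 B).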

Variable types

Numeric: 3
Text: 1

Dataset

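Description: 처리일자 [processing date], 청소년유해업소업종코드 [youth-harmful business type code], 청소년유해업소업종명 [youth-harmful business type name], 건수 [count]
Author: 도봉구 [Dobong-gu]
URL: https://data.seoul.go.kr/dataList/OA-10039/S/1/datasetView.do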

Alerts

Dataset has 868 (8.7%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-05-11 16:01:41.142583
Analysis finished: 2024-05-11 16:01:46.316290
Duration: 5.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 [processing date]
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240426
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
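
Because 처리일자 is profiled as a number, the Range (99) and IQR (85) above are differences between integers, not day counts. A minimal sketch, assuming the values are calendar dates written as YYYYMMDD (the helper name `to_date` is hypothetical), shows how the same quantiles translate into days:

```python
from datetime import date, datetime


def to_date(yyyymmdd: int) -> date:
    """Parse an integer such as 20240411 into a date (hypothetical helper)."""
    return datetime.strptime(str(yyyymmdd), "%Y%m%d").date()


# Quantiles copied from the report.
minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

numeric_range = maximum - minimum                        # integer difference, as reported
span_days = (to_date(maximum) - to_date(minimum)).days   # calendar days between min and max
iqr_days = (to_date(q3) - to_date(q1)).days              # calendar days between Q1 and Q3

print(numeric_range, span_days, iqr_days)
```

Under that assumption the data covers a 29-day span, which is consistent with the 30 distinct values reported for this variable.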

Descriptive statistics

Standard deviation: 40.398353
Coefficient of variation (CV): 1.9959218 × 10^-6
Kurtosis: -1.4824299
Mean: 20240449
Median Absolute Deviation (MAD): 11
Skewness: 0.67081299
Sum: 2.0240449 × 10^11
Variance: 1632.0269
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Common values

Value | Count | Frequency (%)
20240510 | 370 | 3.7%
20240425 | 355 | 3.5%
20240509 | 355 | 3.5%
20240430 | 353 | 3.5%
20240418 | 352 | 3.5%
20240411 | 351 | 3.5%
20240415 | 349 | 3.5%
20240414 | 349 | 3.5%
20240423 | 344 | 3.4%
20240501 | 343 | 3.4%
Other values (20) | 6479 | 64.8%

Minimum 10 values

Value | Count | Frequency (%)
20240411 | 351 | 3.5%
20240412 | 309 | 3.1%
20240413 | 325 | 3.2%
20240414 | 349 | 3.5%
20240415 | 349 | 3.5%
20240416 | 336 | 3.4%
20240417 | 328 | 3.3%
20240418 | 352 | 3.5%
20240419 | 328 | 3.3%
20240420 | 295 | 2.9%

Maximum 10 values

Value | Count | Frequency (%)
20240510 | 370 | 3.7%
20240509 | 355 | 3.5%
20240508 | 323 | 3.2%
20240507 | 316 | 3.2%
20240506 | 340 | 3.4%
20240505 | 320 | 3.2%
20240504 | 320 | 3.2%
20240503 | 314 | 3.1%
20240502 | 329 | 3.3%
20240501 | 343 | 3.4%

청소년유해업소업종코드 [youth-harmful business type code]
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12897.277
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5232.8217
Coefficient of variation (CV): 0.40573075
Kurtosis: 1.4079022
Mean: 12897.277
Median Absolute Deviation (MAD): 105
Skewness: 1.6642288
Sum: 1.2897277 × 10^8
Variance: 27382423
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
10107 | 247 | 2.5%
20101 | 245 | 2.5%
10199 | 238 | 2.4%
10105 | 238 | 2.4%
10112 | 237 | 2.4%
10403 | 236 | 2.4%
10113 | 236 | 2.4%
10101 | 235 | 2.4%
10208 | 234 | 2.3%
24113 | 233 | 2.3%
Other values (49) | 7621 | 76.2%

Minimum 10 values

Value | Count | Frequency (%)
10101 | 235 | 2.4%
10102 | 224 | 2.2%
10103 | 228 | 2.3%
10104 | 221 | 2.2%
10105 | 238 | 2.4%
10106 | 222 | 2.2%
10107 | 247 | 2.5%
10108 | 117 | 1.2%
10109 | 14 | 0.1%
10110 | 196 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
30111 | 139 | 1.4%
30110 | 61 | 0.6%
24205 | 200 | 2.0%
24201 | 200 | 2.0%
24113 | 233 | 2.3%
24101 | 77 | 0.8%
20399 | 157 | 1.6%
20301 | 197 | 2.0%
20199 | 92 | 0.9%
20107 | 206 | 2.1%

청소년유해업소업종명 [youth-harmful business type name]
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2548
Min length: 2

Characters and Unicode

Total characters: 42548
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 철도역구내 [railway station premises]
2nd row: 룸살롱 [room salon]
3rd row: 복어취급 [pufferfish handling]
4th row: 간이주점 [simple bar]
5th row: 뷔페식 [buffet]
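Value | Count | Frequency (%)
기타 [other] | 881 | 8.6%
패스트푸드 [fast food] | 425 | 4.1%
전통찻집 [traditional teahouse] | 313 | 3.1%
관광호텔 [tourist hotel] | 285 | 2.8%
정종,대포집(선술집 [rice-wine bar, drinking house (tavern)] | 247 | 2.4%
분식 [snack food] | 238 | 2.3%
호프(소주방 [beer hall (soju bar)] | 237 | 2.3%
통닭(치킨 [whole fried chicken (chicken)] | 236 | 2.3%
일반조리판매 [general cooked-food sales] | 236 | 2.3%
한식 [Korean food] | 235 | 2.3%
Other values (44) | 6916 | 67.5%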

Most occurring characters

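Value | Count | Frequency (%)
[lost] | 1786 | 4.2%
[lost] | 1610 | 3.8%
) | 1266 | 3.0%
( | 1266 | 3.0%
[lost] | 1059 | 2.5%
[lost] | 995 | 2.3%
[lost] | 881 | 2.1%
[lost] | 881 | 2.1%
[lost] | 838 | 2.0%
[lost] | 807 | 1.9%
Other values (118) | 31159 | 73.2%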

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39520 | 92.9%
Close Punctuation | 1266 | 3.0%
Open Punctuation | 1266 | 3.0%
Space Separator | 249 | 0.6%
Other Punctuation | 247 | 0.6%
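
As a consistency check on the category table above, the five category counts should add up to the reported total of 42548 characters, and each share is the count divided by that total. A small sketch using only numbers printed in this report (the variable names are illustrative):

```python
# Character counts per Unicode general category, copied from the report.
category_counts = {
    "Other Letter": 39520,
    "Close Punctuation": 1266,
    "Open Punctuation": 1266,
    "Space Separator": 249,
    "Other Punctuation": 247,
}
n_rows = 10_000

total_characters = sum(category_counts.values())

# Percentage share of each category, rounded to one decimal as in the report.
shares = {name: round(100 * n / total_characters, 1) for name, n in category_counts.items()}

# Mean string length is total characters divided by the number of rows.
mean_length = total_characters / n_rows

print(total_characters, mean_length)
print(shares)
```

The total matches the "Total characters" figure, and dividing it by the 10000 observations reproduces the reported mean length of 4.2548.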

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[lost] | 1786 | 4.5%
[lost] | 1610 | 4.1%
[lost] | 1059 | 2.7%
[lost] | 995 | 2.5%
[lost] | 881 | 2.2%
[lost] | 881 | 2.2%
[lost] | 838 | 2.1%
[lost] | 807 | 2.0%
[lost] | 764 | 1.9%
[lost] | 738 | 1.9%
Other values (114) | 29161 | 73.8%

Close Punctuation
Value | Count | Frequency (%)
) | 1266 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1266 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 249 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 247 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39520 | 92.9%
Common | 3028 | 7.1%

Most frequent character per script

Hangul
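Value | Count | Frequency (%)
[lost] | 1786 | 4.5%
[lost] | 1610 | 4.1%
[lost] | 1059 | 2.7%
[lost] | 995 | 2.5%
[lost] | 881 | 2.2%
[lost] | 881 | 2.2%
[lost] | 838 | 2.1%
[lost] | 807 | 2.0%
[lost] | 764 | 1.9%
[lost] | 738 | 1.9%
Other values (114) | 29161 | 73.8%

Common
Value | Count | Frequency (%)
) | 1266 | 41.8%
( | 1266 | 41.8%
(space) | 249 | 8.2%
, | 247 | 8.2%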

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39520 | 92.9%
ASCII | 3028 | 7.1%

Most frequent character per block

Hangul
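Value | Count | Frequency (%)
[lost] | 1786 | 4.5%
[lost] | 1610 | 4.1%
[lost] | 1059 | 2.7%
[lost] | 995 | 2.5%
[lost] | 881 | 2.2%
[lost] | 881 | 2.2%
[lost] | 838 | 2.1%
[lost] | 807 | 2.0%
[lost] | 764 | 1.9%
[lost] | 738 | 1.9%
Other values (114) | 29161 | 73.8%

ASCII
Value | Count | Frequency (%)
) | 1266 | 41.8%
( | 1266 | 41.8%
(space) | 249 | 8.2%
, | 247 | 8.2%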

건수 [count]
Real number (ℝ)

Distinct: 770
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162.2664
Minimum: 1
Maximum: 4929
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 134
95-th percentile: 648
Maximum: 4929
Range: 4928
Interquartile range (IQR): 130
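
The quantiles above describe a strongly right-skewed variable: the median is 24 while the maximum is 4929. As an illustration only (Tukey's 1.5 × IQR rule is an assumption made here, not something the report applies), the reported quartiles can be turned into an upper fence for unusually large values:

```python
# Quartiles and extremes of 건수, copied from the report.
q1, q3 = 4, 134
p95, maximum = 648, 4929

iqr = q3 - q1                     # interquartile range
upper_fence = q3 + 1.5 * iqr      # Tukey's upper fence (assumed convention)

print(iqr, upper_fence)
print(p95 > upper_fence, maximum > upper_fence)
```

Since even the 95-th percentile lies above this fence, more than 5% of rows would count as high outliers under that rule.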

Descriptive statistics

Standard deviation: 402.32818
Coefficient of variation (CV): 2.47943
Kurtosis: 38.657745
Mean: 162.2664
Median Absolute Deviation (MAD): 23
Skewness: 5.4175983
Sum: 1622664
Variance: 161867.96
Monotonicity: Not monotonic
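
Several of the descriptive statistics above are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A short check using the reported figures (rounding tolerances are chosen loosely here because the inputs are already rounded):

```python
# Figures for 건수, copied from the report.
mean = 162.2664
std = 402.32818
n_rows = 10_000

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance as squared standard deviation
total = mean * n_rows      # sum of all values

print(round(cv, 5), round(variance, 2), round(total))
```

A CV well above 1 indicates that the spread of 건수 is large relative to its mean, in line with the high skewness and kurtosis reported.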
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1151 | 11.5%
2 | 674 | 6.7%
4 | 465 | 4.7%
3 | 369 | 3.7%
5 | 348 | 3.5%
7 | 200 | 2.0%
8 | 188 | 1.9%
6 | 180 | 1.8%
9 | 163 | 1.6%
11 | 154 | 1.5%
Other values (760) | 6108 | 61.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1151 | 11.5%
2 | 674 | 6.7%
3 | 369 | 3.7%
4 | 465 | 4.7%
5 | 348 | 3.5%
6 | 180 | 1.8%
7 | 200 | 2.0%
8 | 188 | 1.9%
9 | 163 | 1.6%
10 | 82 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
4929 | 1 | < 0.1%
4926 | 3 | < 0.1%
4925 | 1 | < 0.1%
4924 | 3 | < 0.1%
4923 | 1 | < 0.1%
4922 | 1 | < 0.1%
3227 | 2 | < 0.1%
3225 | 1 | < 0.1%
3221 | 4 | < 0.1%
3220 | 1 | < 0.1%

Interactions

[Figures: 9 interaction plots]

Correlations

Variable | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.000
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.142
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.740
건수 | 0.000 | 0.142 | 0.740 | 1.000

Variable | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | -0.010 | 0.014
청소년유해업소업종코드 | -0.010 | 1.000 | -0.284
건수 | 0.014 | -0.284 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

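Index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
27227 | 20240505 | 10408 | 철도역구내 [railway station premises] | 5
29812 | 20240508 | 10208 | 룸살롱 [room salon] | 35
15142 | 20240424 | 10114 | 복어취급 [pufferfish handling] | 2
31629 | 20240509 | 10210 | 간이주점 [simple bar] | 1
29963 | 20240508 | 10106 | 뷔페식 [buffet] | 28
18450 | 20240427 | 10114 | 복어취급 [pufferfish handling] | 2
29685 | 20240508 | 10207 | 비어(바)살롱 [beer (bar) salon] | 6
907 | 20240411 | 20301 | 일반이용업 [general barbershop] | 66
8895 | 20240419 | 10106 | 뷔페식 [buffet] | 14
23630 | 20240502 | 10203 | 관광호텔나이트(디스코) [tourist hotel nightclub (disco)] | 5

Index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
26983 | 20240505 | 10408 | 철도역구내 [railway station premises] | 2
21423 | 20240430 | 10409 | 백화점 [department store] | 56
32212 | 20240510 | 10114 | 복어취급 [pufferfish handling] | 1
29999 | 20240508 | 24113 | 일반게임제공업 [general game provision business] | 2
18355 | 20240427 | 10412 | 커피숍 [coffee shop] | 514
21153 | 20240430 | 10118 | 식육취급 [meat handling] | 22
24457 | 20240503 | 10401 | 과자점 [confectionery] | 4
5258 | 20240415 | 10115 | 김밥(도시락) [gimbap (lunch box)] | 101
28420 | 20240506 | 10103 | 경양식 [light Western food] | 339
25561 | 20240504 | 10403 | 일반조리판매 [general cooked-food sales] | 96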

Duplicate rows

Most frequently occurring

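Index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
486 | 20240427 | 20399 | 이용업 기타 [barbershop, other] | 1 | 7
235 | 20240418 | 20399 | 이용업 기타 [barbershop, other] | 1 | 6
302 | 20240421 | 10201 | 카바레 [cabaret] | 2 | 6
436 | 20240425 | 24201 | 비디오물감상실업 [video viewing room business] | 1 | 6
632 | 20240502 | 10413 | 전통찻집 [traditional teahouse] | 1 | 6
749 | 20240506 | 24201 | 비디오물감상실업 [video viewing room business] | 1 | 6
794 | 20240508 | 20399 | 이용업 기타 [barbershop, other] | 1 | 6
29 | 20240411 | 20399 | 이용업 기타 [barbershop, other] | 1 | 5
117 | 20240414 | 10409 | 백화점 [department store] | 1 | 5
124 | 20240414 | 20399 | 이용업 기타 [barbershop, other] | 1 | 5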
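
The table above lists rows whose four column values repeat exactly. One way such a tally could be produced is sketched below with the standard library; the six rows are a hypothetical mini-dataset built for illustration, not an extract of the real file, and this is not necessarily how the profiling tool computes its figures.

```python
from collections import Counter

# Hypothetical rows: (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수).
rows = [
    (20240427, 20399, "이용업 기타", 1),
    (20240427, 20399, "이용업 기타", 1),
    (20240427, 20399, "이용업 기타", 1),
    (20240421, 10201, "카바레", 2),
    (20240421, 10201, "카바레", 2),
    (20240414, 10409, "백화점", 1),
]

# Count how often each complete row occurs.
counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicated = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicated:
    print(row, n)
```

Rows that differ in any single column, including 건수, are treated as distinct in this sketch.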