Overview

Dataset statistics

Number of variables: 4
Number of observations: 4991
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 170.7 KiB
Average record size in memory: 35.0 B
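
The figures in this block are related by simple arithmetic. The sketch below recomputes the cell count, the missing-cell percentage, and the average record size from the reported totals. It assumes that 1 KiB is 1024 bytes and that the average record size is the total size divided by the number of observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 4
n_observations = 4991
missing_cells = 0
total_size_kib = 170.7  # already rounded to one decimal in the report

# One cell per variable per observation.
n_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / n_cells

# Assumption: 1 KiB = 1024 B, averaged over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"cells: {n_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded, the recomputed average agrees with the reported 35.0 B only to the precision shown.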

Variable types

Text: 1
Numeric: 3

Dataset

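Description: Provides RISS union catalog information operated by the Korea Education and Research Information Service (한국교육학술정보원): institution code (기관코드), survey year (조사년도), number of domestic MARC records built (MARC 데이터 국내 구축건수), and number of foreign MARC records built (MARC 데이터 국외 구축건수).
Author: Korea Education and Research Information Service (한국교육학술정보원)
URL: https://www.data.go.kr/data/15071949/fileData.do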

Alerts

MARC 데이터 국내 구축건수 is highly overall correlated with MARC 데이터 국외 구축건수 (High correlation)
MARC 데이터 국외 구축건수 is highly overall correlated with MARC 데이터 국내 구축건수 (High correlation)
MARC 데이터 국내 구축건수 has 510 (10.2%) zeros (Zeros)
MARC 데이터 국외 구축건수 has 709 (14.2%) zeros (Zeros)
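
The zero percentages in the alerts follow from the zero counts and the 4991 observations reported under "Dataset statistics". A minimal sketch of that calculation, with illustrative variable names:

```python
# Counts copied from the "Alerts" block; 4991 is the number of observations.
n_observations = 4991
zero_counts = {
    "MARC 데이터 국내 구축건수": 510,
    "MARC 데이터 국외 구축건수": 709,
}

# Share of rows whose value is exactly zero, rounded as in the report.
zero_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in zero_counts.items()
}

for column, pct in zero_pct.items():
    print(f"{column}: {pct}% zeros")
```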

Reproduction

Analysis started: 2023-12-12 11:35:13.736204
Analysis finished: 2023-12-12 11:35:16.105335
Duration: 2.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
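
The reported duration can be checked against the two timestamps. The sketch below parses them with the standard library and takes the difference; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 11:35:13.736204")
finished = datetime.fromisoformat("2023-12-12 11:35:16.105335")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```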

Variables

기관코드 (institution code)
Text

Distinct: 484
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 39.1 KiB

Length

Max length: 7
Median length: 6
Mean length: 5.9749549
Min length: 5

Characters and Unicode

Total characters: 29821
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
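
The mean length reported under "Length" is consistent with the total character count divided by the number of observations (4991, from "Dataset statistics"). A quick check, with illustrative variable names:

```python
# Figures copied from the report.
n_observations = 4991
total_characters = 29821

# Mean string length of the text variable.
mean_length = total_characters / n_observations
print(round(mean_length, 7))
```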

Unique

Unique: 9
Unique (%): 0.2%

Sample

1st row: 248032
2nd row: 241005
3rd row: 228001
4th row: 242003
5th row: 243019

Value               Count  Frequency (%)
248032                 11  0.2%
248026                 11  0.2%
245017                 11  0.2%
241056                 11  0.2%
241015                 11  0.2%
241041                 11  0.2%
245027                 11  0.2%
245007                 11  0.2%
241087                 11  0.2%
222014                 11  0.2%
Other values (474)   4881  97.8%

Most occurring characters

Value             Count  Frequency (%)
0                  7505  25.2%
2                  7162  24.0%
1                  4959  16.6%
4                  3940  13.2%
3                  1451  4.9%
5                  1145  3.8%
7                  1100  3.7%
8                   818  2.7%
6                   811  2.7%
9                   522  1.8%
Other values (2)    408  1.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    29413  98.6%
Uppercase Letter    408  1.4%
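
"Decimal Number" and "Uppercase Letter" are the Unicode general categories `Nd` and `Lu`. As a rough illustration of how such a breakdown can be obtained, the sketch below classifies the characters of four institution codes that appear in this report's sample rows using Python's `unicodedata` module. This is not necessarily how the profiling tool computes it, and the name mapping is a hand-written assumption covering only these two categories.

```python
import unicodedata
from collections import Counter

# Hand-written mapping for the two categories seen in this report.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Lu": "Uppercase Letter"}

# Four institution codes taken from the sample rows of the report.
sample_codes = ["248032", "241005", "L0039", "L00005"]

# Count characters by Unicode general category.
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for code in sample_codes
    for ch in code
)

for name, count in category_counts.most_common():
    print(name, count)
```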

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0       7505  25.5%
2       7162  24.3%
1       4959  16.9%
4       3940  13.4%
3       1451  4.9%
5       1145  3.9%
7       1100  3.7%
8        818  2.8%
6        811  2.8%
9        522  1.8%

Uppercase Letter

Value  Count  Frequency (%)
P        214  52.5%
L        194  47.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  29413  98.6%
Latin     408  1.4%

Most frequent character per script

Common

Value  Count  Frequency (%)
0       7505  25.5%
2       7162  24.3%
1       4959  16.9%
4       3940  13.4%
3       1451  4.9%
5       1145  3.9%
7       1100  3.7%
8        818  2.8%
6        811  2.8%
9        522  1.8%

Latin

Value  Count  Frequency (%)
P        214  52.5%
L        194  47.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  29821  100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                  7505  25.2%
2                  7162  24.0%
1                  4959  16.6%
4                  3940  13.2%
3                  1451  4.9%
5                  1145  3.8%
7                  1100  3.7%
8                   818  2.7%
6                   811  2.7%
9                   522  1.8%
Other values (2)    408  1.4%

조사년도 (survey year)
Real number (ℝ)

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.0485
Minimum: 2011
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.0 KiB

Quantile statistics

Minimum: 2011
5-th percentile: 2011
Q1: 2013
Median: 2016
Q3: 2019
95-th percentile: 2021
Maximum: 2021
Range: 10
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.1450654
Coefficient of variation (CV): 0.0015600147
Kurtosis: -1.2068819
Mean: 2016.0485
Median Absolute Deviation (MAD): 3
Skewness: -0.014012206
Sum: 10062098
Variance: 9.8914361
Monotonicity: Increasing
Histogram with fixed size bins (bins=11)

Common values

Value  Count  Frequency (%)
2016     462  9.3%
2017     461  9.2%
2019     460  9.2%
2020     459  9.2%
2021     459  9.2%
2013     458  9.2%
2015     458  9.2%
2014     457  9.2%
2018     453  9.1%
2011     434  8.7%

Minimum 10 values

Value  Count  Frequency (%)
2011     434  8.7%
2012     430  8.6%
2013     458  9.2%
2014     457  9.2%
2015     458  9.2%
2016     462  9.3%
2017     461  9.2%
2018     453  9.1%
2019     460  9.2%
2020     459  9.2%

Maximum 10 values

Value  Count  Frequency (%)
2021     459  9.2%
2020     459  9.2%
2019     460  9.2%
2018     453  9.1%
2017     461  9.2%
2016     462  9.3%
2015     458  9.2%
2014     457  9.2%
2013     458  9.2%
2012     430  8.6%
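
Because 조사년도 has only 11 distinct values and the tables above list a count for each of them, the whole column can be rebuilt from those counts and the summary statistics recomputed. The sketch below does this with the standard library. It assumes the report uses the sample variance (n - 1 denominator) and defines the MAD as the median of absolute deviations from the median; both assumptions reproduce the reported figures.

```python
import statistics

# Per-year counts copied from the value tables of the report.
year_counts = {
    2011: 434, 2012: 430, 2013: 458, 2014: 457, 2015: 458, 2016: 462,
    2017: 461, 2018: 453, 2019: 460, 2020: 459, 2021: 459,
}

# Rebuild the column: each year repeated as often as it occurs.
years = [year for year, count in year_counts.items() for _ in range(count)]

n = len(years)
total = sum(years)
mean = statistics.mean(years)
median = statistics.median(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation
mad = statistics.median(abs(y - median) for y in years)

print(f"n={n} sum={total} mean={mean:.4f} median={median}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.10f} mad={mad}")
```

The near-zero coefficient of variation reflects that the years vary by a few units around a mean of about 2016, so it says little about how evenly the survey years are represented; the per-year counts above are more informative for that.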

MARC 데이터 국내 구축건수 (number of domestic MARC records built)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 4359
Distinct (%): 87.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 187622.13
Minimum: 0
Maximum: 2606656
Zeros: 510
Zeros (%): 10.2%
Negative: 0
Negative (%): 0.0%
Memory size: 44.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 28311.5
Median: 93746
Q3: 269354
95-th percentile: 617562.5
Maximum: 2606656
Range: 2606656
Interquartile range (IQR): 241042.5

Descriptive statistics

Standard deviation: 253227.82
Coefficient of variation (CV): 1.3496692
Kurtosis: 18.043866
Mean: 187622.13
Median Absolute Deviation (MAD): 87700
Skewness: 3.279496
Sum: 9.3642206 × 10^8
Variance: 6.4124329 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
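
To make "fixed size bins" concrete, the sketch below computes the bin width for 50 equal-width bins and shows which bin a few of the reported quantiles would land in. It assumes the bins span exactly the reported minimum and maximum and that the maximum is placed in the last bin (the convention NumPy's histogram uses); the report itself does not state these details, and `bin_index` is a hypothetical helper.

```python
# Range and bin count copied from the report.
minimum, maximum, n_bins = 0, 2606656, 50

# Assumption: equal-width bins covering [minimum, maximum].
bin_width = (maximum - minimum) / n_bins

def bin_index(value):
    """Return the 0-based index of the equal-width bin containing value."""
    if not minimum <= value <= maximum:
        raise ValueError("value outside histogram range")
    # The maximum itself is assigned to the last bin.
    return min(int((value - minimum) / bin_width), n_bins - 1)

quantiles = [
    ("median", 93746),
    ("Q3", 269354),
    ("95-th percentile", 617562.5),
    ("maximum", 2606656),
]
print(f"bin width: {bin_width}")
for label, value in quantiles:
    print(f"{label}: bin {bin_index(value)}")
```

Under these assumptions the 95-th percentile falls in bin 11, so about 95% of the observations occupy the first 12 of the 50 bins, which matches the strong right skew reported for this variable.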

Common values

Value                 Count  Frequency (%)
0                       510  10.2%
9069                     10  0.2%
2203                      9  0.2%
10                        8  0.2%
10458                     7  0.1%
161625                    5  0.1%
5400                      4  0.1%
250                       3  0.1%
215                       3  0.1%
45174                     3  0.1%
Other values (4349)    4429  88.7%

Minimum 10 values

Value  Count  Frequency (%)
0        510  10.2%
8          1  < 0.1%
9          2  < 0.1%
10         8  0.2%
12         1  < 0.1%
13         1  < 0.1%
15         1  < 0.1%
27         1  < 0.1%
43         2  < 0.1%
46         1  < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2606656      1  < 0.1%
2535062      1  < 0.1%
2510722      1  < 0.1%
2468020      1  < 0.1%
2459978      1  < 0.1%
2440366      1  < 0.1%
2428791      1  < 0.1%
2418910      1  < 0.1%
2399496      1  < 0.1%
2379988      1  < 0.1%

MARC 데이터 국외 구축건수 (number of foreign MARC records built)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 3916
Distinct (%): 78.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 95821.577
Minimum: 0
Maximum: 4207542
Zeros: 709
Zeros (%): 14.2%
Negative: 0
Negative (%): 0.0%
Memory size: 44.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1578
Median: 13951
Q3: 84218.5
95-th percentile: 420798
Maximum: 4207542
Range: 4207542
Interquartile range (IQR): 82640.5

Descriptive statistics

Standard deviation: 249501.89
Coefficient of variation (CV): 2.6038174
Kurtosis: 56.926857
Mean: 95821.577
Median Absolute Deviation (MAD): 13951
Skewness: 6.4552414
Sum: 4.7824549 × 10^8
Variance: 6.2251195 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
0                       709  14.2%
1931                     11  0.2%
2981                      9  0.2%
411                       8  0.2%
80                        8  0.2%
733                       7  0.1%
1454                      7  0.1%
4                         7  0.1%
200                       6  0.1%
117                       6  0.1%
Other values (3906)    4213  84.4%

Minimum 10 values

Value  Count  Frequency (%)
0        709  14.2%
1          3  0.1%
2          3  0.1%
3          5  0.1%
4          7  0.1%
5          3  0.1%
6          5  0.1%
7          5  0.1%
8          1  < 0.1%
9          2  < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
4207542      1  < 0.1%
2686127      1  < 0.1%
2667666      1  < 0.1%
2589201      1  < 0.1%
2579597      1  < 0.1%
2568524      1  < 0.1%
2568162      1  < 0.1%
2529926      1  < 0.1%
2524528      1  < 0.1%
2499404      1  < 0.1%

Interactions

[Nine interaction plots, one for each pairing of the three numeric variables; images not included]

Correlations

[Correlation heatmap; image not included]

Variable  조사년도  MARC 데이터 국내 구축건수  MARC 데이터 국외 구축건수
조사년도  1.000  0.000  0.000
MARC 데이터 국내 구축건수  0.000  1.000  0.513
MARC 데이터 국외 구축건수  0.000  0.513  1.000

[Correlation heatmap; image not included]

Variable  조사년도  MARC 데이터 국내 구축건수  MARC 데이터 국외 구축건수
조사년도  1.000  0.073  0.050
MARC 데이터 국내 구축건수  0.073  1.000  0.895
MARC 데이터 국외 구축건수  0.050  0.895  1.000
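
The two matrices give quite different values for the same pair of variables (0.513 and 0.895), and the report text does not say which coefficient each matrix holds. As a general illustration of how two common coefficients can disagree on skewed data, the sketch below computes Pearson's r and Spearman's rank correlation on a small hypothetical data set with one extreme value. The numbers are invented for the example and are not taken from the dataset, and nothing here identifies the coefficients actually used in the report.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y rises with x throughout, but x has one extreme value.
x = [1, 2, 3, 4, 5, 1000]
y = [1, 4, 9, 16, 25, 30]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

In this toy example the rank-based coefficient is 1 because the relationship is perfectly monotonic, while Pearson's r is pulled well below 1 by the single extreme value. Both MARC count variables are strongly right-skewed, so a gap of this kind between a linear and a rank-based measure would not be surprising.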

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  기관코드  조사년도  MARC 데이터 국내 구축건수 and MARC 데이터 국외 구축건수 (digits run together in the source)
0  248032  2011  5884616970
1  241005  2011  549819129416
2  228001  2011  3906111191
3  242003  2011  317738181491
4  243019  2011  5258911195
5  211001  2011  327213185773
6  241027  2011  17673080328
7  211002  2011  3316121570
8  247013  2011  9319384778
9  211003  2011  13395465962

Last rows

Index  기관코드  조사년도  MARC 데이터 국내 구축건수 and MARC 데이터 국외 구축건수 (digits run together in the source)
4981  224014  2021  358393222232
4982  224015  2021  13040474962
4983  247001  2021  48157400
4984  244019  2021  825099278100
4985  245013  2021  34117989764
4986  244026  2021  33039952730
4987  211064  2021  601521488020
4988  L0039  2021  92590
4989  L00005  2021  00
4990  211075  2021  208105253086