Used code: https://github.com/EML4U/Drift-detector-comparison/blob/e9272be7b8ef11d96191c5136ec3a38a84b153fa/amazon_movie_drift.py

Code to read data and print examples:

    import pickle

    # embeddings_file is the path to an embeddings pickle, e.g. amazon_drift_bow_768.pickle
    with open(embeddings_file, 'rb') as handle:
        data = pickle.load(handle)
    print(type(data), len(data))
    for key, value in data.items():
        print(key, type(value), len(value))
        # Print the first element of each component stored under this key
        for i in range(len(value)):
            print(value[i][0])
        print()

samples_10000_2011-2012.pickle

10,000 samples (2,000*5)

    2021-06-17 22:45:09.202373
    mode bow_768
    amazon_raw_file /home/eml4u/EML4U/data/amazon/amazon_raw.pickle
    embeddings_file /home/eml4u/EML4U/data/amazon/amazon_drift_bow_768.pickle
    sample_file /home/eml4u/EML4U/data/amazon/samples_10000_2011-2012.pickle
    gensim_model_768_file /home/eml4u/EML4U/data/amazon/amazonreviews_e.model
    Loading data
    Docs in year overview: {1997: 108, 1998: 5005, 1999: 78977, 2000: 333975, 2001: 348726, 2002: 368738, 2003: 389683, 2004: 511385, 2005: 612704, 2006: 600002, 2007: 784606, 2008: 744834, 2009: 744131, 2010: 745300, 2011: 805154, 2012: 838356}
    Example keys ['2/2', 4, datetime.datetime(1997, 8, 20, 2, 0), 2381344]
    Created 10000 | 10000 samples
    Incecting 0 | 0.05
    Incecting 1 | 0.1
    Incecting 2 | 0.15
    Incecting 3 | 0.2
    Incecting 4 | 0.25
    Incecting 5 | 0.3
    Incecting 6 | 0.35
    Incecting 7 | 0.4
    Incecting 8 | 0.45
    Incecting 9 | 0.5
    Incecting 10 | 0.55
    Incecting 11 | 0.6
    Incecting 12 | 0.65
    Incecting 13 | 0.7
    Incecting 14 | 0.75
    Incecting 15 | 0.8
    Incecting 16 | 0.85
    Incecting 17 | 0.9
    Incecting 18 | 0.95
    Incecting 19 | 1
    Runtime: 9776.574743509293 seconds = 163 minutes

amazon_drift_bow_50.pickle

    [...]
    Loading data
    Loaded 10000 | 10000 | 10000 samples
    Incecting 0 | 0.05
    Incecting 1 | 0.1
    Incecting 2 | 0.15
    Incecting 3 | 0.2
    Incecting 4 | 0.25
    Incecting 5 | 0.3
    Incecting 6 | 0.35
    Incecting 7 | 0.4
    Incecting 8 | 0.45
    Incecting 9 | 0.5
    Incecting 10 | 0.55
    Incecting 11 | 0.6
    Incecting 12 | 0.65
    Incecting 13 | 0.7
    Incecting 14 | 0.75
    Incecting 15 | 0.8
    Incecting 16 | 0.85
    Incecting 17 | 0.9
    Incecting 18 | 0.95
    Incecting 19 | 1
    Runtime: 135.50933782656986 minutes

    3
    orig 2
    [ 0.16140623 0.5530995 0.21874928 0.32545418 -0.19398347 -0.19325541
     -0.6485396 0.03157682 0.15401375 0.01986452 0.7121575 0.4032001
     -0.25448695 0.17547514 0.06446503 -0.51214224 0.36386934 0.3736193
     -0.26834628 0.22421001 0.10146964 0.21272671 -0.14617644 0.21766204
     0.16195424 -0.58069396 0.53670204 -0.5659764 0.12783474 1.1558565
     0.51343185 -0.12584776 -0.1867136 -0.2774471 -0.438013 -0.34834224
     0.14581528 0.8462393 -0.56429267 -0.363112 0.62471384 0.07465463
     -0.44753277 -0.28984335 0.13500762 0.59927857 -0.56461364 0.06896804
     -0.31473526 0.10755032]
    ['4/10', 0, datetime.datetime(2011, 8, 10, 2, 0), 4971744]

    drifted 2
    [[ 0.72875 -0.6187408 0.19364193 ... -0.15850508 0.00625972 0.620258 ]
     [ 0.9596036 0.8444182 0.91501015 ... 0.24346896 0.7479761 0.24727453]
     [-0.00732264 0.07616684 -0.736606 ... -0.39024183 0.06719165 -0.8104374 ]
     ...
     [-0.16396342 -0.97305864 0.34385815 ... 0.12753206 0.20968458 -0.0420376 ]
     [ 0.24428537 -0.73603046 0.4576348 ... 0.17412803 0.03943965 -0.31983873]
     [-0.00917988 -0.2929538 -0.12057457 ... -0.14085075 -0.42963296 -0.33194673]]
    ['4/4', 0, datetime.datetime(2011, 8, 25, 2, 0), 1519544]

    train 2
    [ 1.0861398 0.63141567 -0.2780384 -0.33503821 -0.18154497 -0.16598582
     -0.25669703 -0.3461884 0.5926555 -0.08801441 0.15960507 0.0974912
     0.5623177 -0.73583126 0.56054354 -0.08152812 -0.15453282 0.40759096
     0.19171795 0.26280627 0.55693054 0.27066246 0.49444208 0.5262198
     0.39811286 -0.29082957 0.24897727 -0.5617907 0.39319697 -0.22808836
     0.23933463 -0.25039622 0.0991925 0.7692687 -0.7458923 -0.36769843
     -0.7253941 0.6117947 -0.27289122 -0.14525098 0.80038285 -0.59485775
     0.29521847 0.7659478 -0.17234014 -0.6963027 -0.8259567 -0.4904697
     -0.4201583 -0.43110028]
    ['1/3', 0, datetime.datetime(2012, 2, 22, 1, 0), 1570821]

amazon_drift_bow_768.pickle

    2021-06-18 20:17:43.111964
    mode bow_768
    amazon_raw_file /home/eml4u/EML4U/data/amazon/amazon_raw.pickle
    embeddings_file /home/eml4u/EML4U/data/amazon/amazon_drift_bow_768.pickle
    sample_file /home/eml4u/EML4U/data/amazon/samples_10000_2011-2012.pickle
    gensim_model_768_file /home/eml4u/EML4U/data/amazon/amazonreviews_e.model
    Loaded 10000 | 10000 | 10000 samples
    Incecting 0 | 0.05
    Incecting 1 | 0.1
    Incecting 2 | 0.15
    Incecting 3 | 0.2
    Incecting 4 | 0.25
    Incecting 5 | 0.3
    [...]

    -rw-rw-r-- 1 eml4u eml4u 675M Jun 18 22:57 amazon_drift_bow_768.pickle

    -> 2 hours, 39 minutes and 17 seconds -> 159 minutes
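
Sketch: compact structure summary instead of printing full arrays. This is a minimal sketch, not part of amazon_movie_drift.py. It assumes the layout suggested by the printed examples: a dict whose values are pairs of (embeddings, keys), where the embeddings under "drifted" appear to be nested one level deeper. The helper names nested_lengths and summarize and the toy values are hypothetical; the real files are not read here.

```python
import datetime
import pickle
import tempfile
from pathlib import Path


def nested_lengths(obj, max_depth=3):
    """Follow the first element at each level and collect the lengths seen."""
    lengths = []
    for _ in range(max_depth):
        # Strings, bytes and dicts are treated as leaves.
        if isinstance(obj, (str, bytes, dict)):
            break
        try:
            n = len(obj)
        except TypeError:
            break  # scalars and other objects without a length
        lengths.append(n)
        if n == 0:
            break
        obj = obj[0]
    return lengths


def summarize(data):
    """Map each top-level key to the nested lengths of each of its components."""
    return {key: [nested_lengths(part) for part in value] for key, value in data.items()}


# Toy stand-in for an embeddings pickle (assumed layout, made-up values).
toy_keys = [
    ["1/1", 0, datetime.datetime(2011, 1, 1, 0, 0), 1],
    ["2/2", 0, datetime.datetime(2011, 1, 2, 0, 0), 2],
]
toy_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
toy = {
    "orig": (toy_vectors, toy_keys),
    # Assumption: one block of vectors per injection step.
    "drifted": ([toy_vectors, toy_vectors], toy_keys),
}

# Round trip through a temporary pickle file to mirror the loading code.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy_embeddings.pickle"
    with open(path, "wb") as handle:
        pickle.dump(toy, handle)
    with open(path, "rb") as handle:
        loaded = pickle.load(handle)

summary = summarize(loaded)
for key, parts in summary.items():
    print(key, parts)
```

With the real files, replacing the toy round trip by the pickle.load call from the reading code should give one short line per key, which is easier to scan than the raw array dumps. The helper only relies on len() and indexing, so it should also work on NumPy arrays.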