Dislikes and other metadata for 4.56 Billion YouTube videos crawled by Archive Team in flat file and JSON format (torrent)

Hello everyone, I've finished processing 69TB of data collected by Archive Team from YouTube in November/December 2021. The data encompasses metadata for 4.56B YouTube videos. The result is 4 torrent sets (totaling 2.3TB); the same data is also being uploaded to archive.org. If you need the data or wish to help with seeding, the magnet links and technical details are below. Thanks to everyone already seeding the files. Some fields like category, tags, codecs and subtitles are missing, as this data was not captured by the original Archive Team crawl. Hopefully it will be captured in future crawls.

I wish you all a happy new year!

Minimal dislike data - 76GB

magnet:?xt=urn:btih:a8de66ae506937c0b19959a652496dff20073b57&dn=videos_minimal&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=http%3a%2f%2fshare.camoe.cn%3a8080%2fannounce&tr=udp%3a%2f%2ftracker.torrent.eu.org%3a451%2fannounce&tr=http%3a%2f%2ft.nyaatracker.com%3a80%2fannounce&ws=https%3a%2f%2fdl-eu.opendataapi.net%2farchiveteam-youtube-dislikes-w-metadata-2021%2f

Video flat files - 345GB

magnet:?xt=urn:btih:84e58d5bd66ba5139c94cbd8bce32fd0e70d9977&dn=videos_flat&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=http%3a%2f%2fshare.camoe.cn%3a8080%2fannounce&tr=udp%3a%2f%2ftracker.torrent.eu.org%3a451%2fannounce&tr=http%3a%2f%2ft.nyaatracker.com%3a80%2fannounce&ws=https%3a%2f%2fdl-eu.opendataapi.net%2farchiveteam-youtube-dislikes-w-metadata-2021%2f

Video JSON files - 1.1TB

magnet:?xt

...
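
For anyone scripting against these links: a magnet link is a URI whose query string carries the info hash (xt), display name (dn), trackers (tr) and web seeds (ws). A minimal sketch in Python, assuming a hypothetical helper named parse_magnet (not something from the original post) and using a shortened copy of the first link above:

```python
from urllib.parse import parse_qs


def parse_magnet(link: str) -> dict:
    """Split a magnet URI into info hash, display name, trackers and web seeds."""
    scheme, _, query = link.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet link")
    params = parse_qs(query)  # parse_qs also percent-decodes the values
    xt = params.get("xt", [""])[0]
    prefix = "urn:btih:"
    return {
        "info_hash": xt[len(prefix):] if xt.startswith(prefix) else None,
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
        "web_seeds": params.get("ws", []),
    }


# Shortened version of the "Minimal dislike data" link (one tracker only).
link = (
    "magnet:?xt=urn:btih:a8de66ae506937c0b19959a652496dff20073b57"
    "&dn=videos_minimal"
    "&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce"
)
info = parse_magnet(link)
print(info["info_hash"], info["name"], info["trackers"])
```

This only reads the fields; it does not contact any tracker or verify that the torrent is alive.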

👍 1k
👀 u/jopik1
📅 Dec 31 2021

Last C# PDF doc/tutorial by Microsoft. Tomorrow, the PDF generation feature will be officially retired. So, I took this opportunity to archive this format. (Up to .NET 6) archive.org/details/LastC…
👍 935
👀 u/Rozzemak
📅 Jan 02 2022

Not Pure Found Footage but this new series is about an archivist who tries to put found footage back together to solve a mystery of a missing woman, based on the original found footage format podcast Archive 81. youtube.com/watch?v=ibxKE…
👍 43
👀 u/foundfootagefan
📅 Jan 06 2022

apt-offline_1.8.2-1_all.deb "not a debian format archive" error

My network is not working on my Linux box, so I downloaded apt-offline from https://packages.ubuntu.com/focal/all/apt-offline/download

I copied it to the Ubuntu box.

I ran a checksum and compared it with the one on the above link, and it matches.

$ shasum apt-offline_1.8.2-1_all.deb
9584d3d68492b17c01994f9c9fe2775f979ddcba  apt-offline_1.8.2-1_all.deb

Then I ran this command:

sudo dpkg -i apt-offline_1.8.2-1_all.deb

It gave the error "not a debian format archive".

How can I fix this issue, or how can I get a correct apt-offline?

Motherboard

https://www.techpowerup.com/review/msi-mpg-z690-carbon-wifi/
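
A .deb file is an ar archive, so it should start with the 8-byte magic "!<arch>" plus a newline and contain members such as debian-binary, control.tar.* and data.tar.*. A minimal sketch in Python for checking what a file actually contains, assuming hypothetical helper names (list_ar_members, _member) and a tiny in-memory sample rather than the real package:

```python
AR_MAGIC = b"!<arch>\n"


def list_ar_members(data: bytes) -> list[str]:
    """Return the member names of an ar archive (the container used by .deb)."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("missing ar magic; dpkg would not accept this file")
    names, pos = [], len(AR_MAGIC)
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        # Name field is 16 bytes, space padded; some writers append "/".
        names.append(header[0:16].decode("ascii").strip().rstrip("/"))
        size = int(header[48:58].decode("ascii").strip())
        pos += 60 + size + (size % 2)  # member data is padded to an even length
    return names


def _member(name: str, payload: bytes) -> bytes:
    """Build one ar member (illustration only, fixed owner, mode and mtime)."""
    header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(payload):<10}"
    pad = b"\n" if len(payload) % 2 else b""
    return header.encode("ascii") + b"`\n" + payload + pad


sample = (
    AR_MAGIC
    + _member("debian-binary", b"2.0\n")
    + _member("control.tar.xz", b"abc")
    + _member("data.tar.xz", b"")
)
members = list_ar_members(sample)
print(members)
```

To inspect the real download, pass the bytes read from apt-offline_1.8.2-1_all.deb (opened in binary mode) instead of the sample. This only shows what the file contains; it does not explain why dpkg rejected it.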

👍 2
👀 u/KamilPierre
📅 Jan 15 2022

Edgedancer: From the Stormlight Archive by Brandon Sanderson (Multiple formats and DRM-free, $2.99)
👍 24
👀 u/sevae
📅 Jan 23 2022

Dislikes and other metadata for 4.56 Billion YouTube videos crawled by Archive Team in flat file and JSON format (torrent) old.reddit.com/r/DataHoar…
👍 143
👀 u/jopik1
📅 Dec 31 2021

Minneapolis Star Tribune online archives available for free thru tomorrow. Download clippings or whole pages in PDF or JPG format. Goes back to 1st edition of the Minneapolis Daily Tribune, May 25, 1867 startribune.newspapers.co…
👍 28
👀 u/BortWard
📅 Jan 02 2022

Extracting errors "This archive is either in unknown format or damaged" "No archives found" (winRAR)

I've disabled all of my protections etc. to attempt to fix this, but it didn't work. I've tried using alternatives like 7zip, with no result. I've made so many attempts to fix this, to no avail. I've repaired it with the winRAR repair tool, but that just told me that there were no files in the zip. I'm asking you lovely people here to help me so I haven't wasted my $20. Please feel free to ask me any questions or ask for screenshots if needed!

👍 2
👀 u/tbm06
📅 Dec 27 2021

How to get around 'unrecognized archive format' for .tar.zst files while trying to update old system - lxc container

I hadn't used the container for perhaps a year or so and want to do so again.

Trying to update yields that error. I know it is because they switched to .zst for packages some time ago, but how can I get around it so that I can update the system? I have to update the keyring for missing keys before the update will be allowed, and the packages are asking for it as well.

So how do I work around this? A way that doesn't require isoboot would be better, ideally one that works with just a net connection, because this is an lxc container rather than a vm.

👍 3
👀 u/usera8787782
📅 Dec 29 2021

Any archive of books to download in "pdf" format?

Is there any website where I can download books about being a father in ".pdf"?
I've seen multiple reviews on many books, but I want to give them a closer look before deciding which ones I'm going to bring home.

Thanks!

👍 4
👀 u/munhozmib
📅 Dec 19 2021

(New format!) Archive Brew: Tempest Spear [v 2.0] - throw your weapon with the power of the storm [Mythmaker's Grimoire Vol. 1]
👍 21
👀 u/Rashizar
📅 Nov 09 2021

Hey guys! Does anybody know if Hasbro rereleases exclusive figures in Archive format? Thanks!
👍 5
👀 u/LastCheatMeal
📅 Nov 05 2021

I keep forgetting how to extract tar or .7z archives, so I made a CLI that provides a single interface to manage any compression format!
👍 139
👀 u/SexySlowLoris
📅 Aug 16 2021

Project Archive: new format and setup, so any feedback is welcome. And given the scope of the project, suggestions would be great too! youtu.be/pO2PBADIiSY
👍 17
👀 u/MiniExpBounder
📅 Oct 20 2021

Stormlight Archive Beginner, Format Confusion

Hi everyone!

Been reading info about this question but wanted to hear a bit more. I'm just wondering what format you bought your books in and why.

I've been going back and forth between mmpb, trade, and hardbound, but everything sounds fine to me; I mostly think about the cost and shelf space (1k+ pages x 10 books).

I understand if this question sounds funny to others; I myself feel frustrated about it 😅. This is why I just wanted to know what format yours are in and why, rather than asking to be convinced.

Thank you in advance; your thoughts would be helpful.

👍 7
👀 u/revieree
📅 Sep 19 2021

The Internet Archive is building a versioned complete catalog of the digital scholarly record. It includes post-PDF formats and metadata, but does not review or exclude retracted/predatory works. fatcat.wiki/about
👍 13
👀 u/GrassrootsReview
📅 Nov 04 2021

Internet Archive of almost all nintendo arcade badges in png format archive.org/details/badge…
👍 14
👀 u/Deactiv4ted
📅 Oct 27 2021

Wait Wait: Don't Tell me there's an archive! (Scraped all of the 2000-2005 episodes, all in Real Media format) archive.org/details/wait-…
👍 97
👀 u/parkerlreed
📅 Jul 27 2021

Pacman sync databases, unrecognized archive format

Edit: Solved, I guess one of the mirrors I use broke. I changed my mirrorlist and everything is back to normal.

Hello everyone, my Pacman is broken. After I run pacman -Syu, I get this error

[baran@archie ~]$ sudo pacman -Syu
:: Synchronizing package databases...
core                                                        625.8 KiB   854 KiB/s 00:01 [###################################################] 100%
extra                                                       625.8 KiB  1244 KiB/s 00:01 [###################################################] 100%
community                                                   625.8 KiB  1526 KiB/s 00:00 [###################################################] 100%
:: Starting full system upgrade...
error: could not open file /var/lib/pacman/sync/core.db: Unrecognized archive format
error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized archive format
error: could not open file /var/lib/pacman/sync/community.db: Unrecognized archive format
there is nothing to do

I inspected the database files with file and I saw that they are HTML files.

file output:

[baran@archie ~]$ file /var/lib/pacman/sync/core.db
/var/lib/pacman/sync/core.db: HTML document, ASCII text, with very long lines (65536), with no line terminators

If I open the database files, they contain the same page as https://www.zerobounce.net/. I was able to do system upgrades yesterday. I was connecting to the internet with a Huawei wifi extender. To troubleshoot, I connected to my phone's mobile internet via hotspot and ran pacman -Syy. They are still HTML files. Is it about my configuration, or has something happened to the Pacman servers?
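
The file output above is the key clue: a sync database should be a compressed tar archive, not a web page. A rough sketch of the same check in Python, assuming a hypothetical helper called sniff and only a handful of well-known magic numbers (gzip, xz, zstd):

```python
def sniff(head: bytes) -> str:
    """Guess what a file is from its first bytes."""
    if head.startswith(b"\x1f\x8b"):
        return "gzip"
    if head.startswith(b"\xfd7zXZ\x00"):
        return "xz"
    if head.startswith(b"\x28\xb5\x2f\xfd"):
        return "zstd"
    text = head.lstrip().lower()
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"  # a mirror or proxy answered with a web page
    return "unknown"


# Illustrative inputs only; these are not bytes taken from the poster's system.
samples = {
    "healthy database": b"\x1f\x8b\x08\x00",
    "broken mirror": b"<!DOCTYPE html><html><head>",
}
for label, head in samples.items():
    print(label, "->", sniff(head))
```

Running sniff on the first few bytes of each file under /var/lib/pacman/sync would flag a mirror that is serving HTML, which matches the fix described in the edit at the top of the post (switching mirrors).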

👍 34
👀 u/Tinasour
📅 Jul 21 2021

Amiga Format magazines removed from Archive.org

I uploaded some scans of Amiga Format magazine a few years ago to The Internet Archive. But recently I received an email from Archive.org saying they have had a copyright complaint from The Publishers Association about the materials uploaded by my user account to archive.org, and taken down the Amiga Format magazines I had uploaded.

When I went to look on Archive.org I couldn't spot any Amiga Format magazines still uploaded there, only disk ISOs. Previously there was nearly a full collection of them on there. And other Amiga mags like CU Amiga are still available to download.

Does this mean Future Publishing have been going after places sharing copies of old magazines like AF? I know technically they are still under copyright, but Future aren't going to make much money off 30 year old magazines any more. Even if they made them available as official digital downloads, it's still quite a small audience who would purchase.

👍 32
👀 u/mod81
📅 Jul 07 2021

PicView - Free and open source picture viewer with compact UI, that can be hidden. It can view images insides archives, as well as comic book formats. github.com/Ruben2776/PicV…
👍 6
👀 u/reps_up
📅 Sep 08 2021

Arcade Hacks Archive in IPS format, contains hacks for various fighting games and works under FinalBurn Neo and Libretro. archive.org/details/SFIII…
👍 81
👀 u/IcyTable208X
📅 Jul 16 2021

Complete @officialmcafee Tweets archive in text format ia801004.us.archive.org/1…
👍 18
👀 u/SebQc77
📅 Sep 01 2021

Where can I buy the revised version of Stormlight Archives where the title of the tWoK is in horizontal format?
👍 11
📅 Sep 15 2021

In Limited formats, opening/drafting one of the Mystical Archive cards that is banned in Historic will give you gems as if you already had 4x copies. FYI.

I opened my first Sealed. After clicking past the 6 rares it shows you once you open the packs, it showed me a second screen saying I got 20 gems for already having 4x of one of the rares I opened, which I knew couldn't be the case. One of my mystical archive cards was Swords to Plowshares. So I'm guessing it gives you gems if you open one of the cards that's banned in Historic. (only if you open in Limited formats)

👍 191
👀 u/GrantDayton
📅 Apr 15 2021

A list of permanent cards for Historic which would fit its format vision and "balance" the powerful spells we got with Mystical Archives.

First, English is not my native language, so I am sorry if I make some mistakes.

It's now been almost 2 months since Mystical Archives changed the Historic format with powerful and well known spells from Magic's history. With them, combo, control and tempo decks have all gained a lot of steam. But sadly, midrange and aggressive decks have been let down.

Following the vision of the Historic format, which is to add well known cards that have marked Magic's history without being too powerful, I will propose a (non-exhaustive) list of cards that I think would be neat to have in Historic.

Green:

[[tarmogoyf]] seems a really good inclusion to the Historic card pool and would (I believe) not be broken because the fetchlands and mishra's bauble are not available.

[[Tireless tracker]]

[[Hexdrinker]] would need the level up mechanic to be implemented on Arena, though.

[[young wolf]]

Blue:

[[Delver of secrets]] would be good in Historic with brainstorm available but not broken due to the absence of free counterspells.

[[vendilion clique]] [[master of etherium]] [[spellstutter sprite]] [[jace, vryn's prodigy]] [[mausoleum wanderer]]

Red:

[[seasoned pyromancer]]

[[goblin guide]] wouldn't be too strong I believe, due to the lack of lightning bolt.

[[seal of fire]] [[alesha, who smiles at death]] [[bedlam reveler]]

White:

[[Stoneforge mystic]]: with only embercleave and sword of body and mind available, I think it could enable an interesting equipment-focused archetype in Historic.

[[flickerwisp]] [[gideon, ally of zendikar]] [[serra, the benevolent]] [[monastery mentor]] [[sigarda's aid]]

Black:

[[Kalitas, traitor of ghet]] [[liliana, the last hope]] [[gurmag angler]] [[changeling outcast]] [[blood-soaked champion]] [[liliana, heretical healer]] [[viscera seer]] [[dark confidant]] [[oath of liliana]]

Colorless:

[[Aether Vial]] would make sense with the vision of the format but might be too strong for it.

[[sword of light and shadow]] [[frogmite]] [[coretapper]] [[umezawa's jitte]] (It might end up way too strong for historic, i don't know) [[phyrexian altar]] [[decimator of the provinces]] [[emrakul, the promise end]] [[the chain veil]]

Multicolor:

[[tidehollow sculler]] [[vexing shusher]] [[Ice-fang coatl]] [[siege rhino]] [[grim flayer]] [[huntmaster

...

👍 11
👀 u/MuramasaKoga
📅 Jun 15 2021

Grim Press - The Unbound Atlas Announcement!! New Map Archive & Now offering Universal VTT Map Format for our maps!!

Grim Press - The Unbound Atlas is thrilled to announce the following upgrades to our patreon.

Map Archive: The Unbound Atlas Archive is now live. Patrons will be able to access ALL the maps we have released so far and every map going forward in one single spot. This should make finding the map you want or need much easier going forward!!

New Variations: Starting with the next release a new variation will be available to Elite Supporters for all new maps. This variation is known as a "Universal VTT Format." This format contains both Lighting and Line of Sight data that is compatible with all major VTT programs.

As always please feel free to reach out below with comments or questions.

👍 6
👀 u/DmDomination110
📅 Sep 17 2021

PicView - Free and open source picture viewer with compact UI, that can be hidden. It can view images insides archives, as well as comic book formats. github.com/Ruben2776/PicV…
👍 9
👀 u/reps_up
📅 Sep 08 2021

STACS - YARA based static credential scanner which supports binary file formats, and nested archives. github.com/stacscan/stacs
👍 9
👀 u/Darkarnium
📅 Aug 27 2021

When attempting to extract from a .tgz file on Windows 10, getting "Unrecognized archive format" error

For a project I recently began working on, I had to download large data files onto my computer via command line, which were compressed and saved as .tgz files.

The downloads were successful, but now that they are downloaded, I cannot figure out how to extract the files from inside. I checked, and they definitely saved as .tgz files (with a 5GB size so I know something is stored in there), but nothing is working to access the files themselves. I've tried doing this through programs such as WinZip and 7zip, but both give errors saying the file is "not a valid archive". I then tried to manually extract the files through command line, using "tar -xzvf <myfile>.tgz" and "tar -xzvf C:<myfilepath>\<myfile>.tgz -C C:<pathtofolder>", but the only return is "tar: Error opening archive: Unrecognized archive format".

From what I could gather by reading through articles on tar.gz files and some forums about others with similar problems, I know that the .tgz files might be extractable using Linux. However, I have no experience whatsoever working with Linux and so would prefer to solve this in Windows if possible.
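
One way to narrow this down on Windows without Linux is to check whether the file really is gzip data before blaming the extraction tools. A minimal sketch using Python's standard tarfile module, assuming a hypothetical helper named describe_tgz and a small archive built in memory for the demonstration:

```python
import io
import tarfile


def describe_tgz(fileobj) -> list[str]:
    """Return member names if fileobj holds a gzip-compressed tar, else raise."""
    head = fileobj.read(2)
    fileobj.seek(0)
    if head != b"\x1f\x8b":  # gzip magic number
        raise ValueError(f"not gzip data (first bytes: {head!r})")
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()


# Build a tiny .tgz in memory so the example runs on its own.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as archive:
    payload = b"hello"
    entry = tarfile.TarInfo("data/readme.txt")
    entry.size = len(payload)
    archive.addfile(entry, io.BytesIO(payload))
buf.seek(0)

names = describe_tgz(buf)
print(names)
```

For a real download, open the .tgz file in binary mode and pass the file object to describe_tgz. If it raises on the magic check, the download itself is probably not an archive (for example an error page or a different format saved under a .tgz name), which would also explain why WinZip, 7zip and tar all reject it.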

👍 3
👀 u/Psn525
📅 Jul 17 2021

Simple archive format for unix.

Hey everybody!

This is my first library written in C. It's a simple yet fast way to manipulate sarf archives. I've been working on it for a week or so and I'm really happy with how it turned out, so I thought it was time to share it with other people and get some feedback.

It's actually a TAR clone but way simpler. I love all these unix utilities (like tar, vim, etc.) and I wanted to experiment on something similar.

This is the git repository: https://github.com/billvog/sarf

You are welcome to make pull requests or leave some issues!

👍 3
👀 u/cute-voyager
📅 Jul 18 2021

When attempting to extract from a .tgz file on Windows 10, getting "Unrecognized archive format" error /r/techsupport/comments/o…
👍 2
👀 u/Psn525
📅 Jul 17 2021

You know what format will benefit the most from Mystical Archive? Cube Draft!

The latest iteration of Arena Cube was powerful and fun, but it was still a far cry from my absolute favourite Magic format: Vintage Cube.

With this huge influx of new and powerful instants and sorceries, the power level of the Arena Cube is going to skyrocket. Swords, DT, Dark Ritual, Mana Tithe, Primal Command, Time Warp, and many more. This should make for a great format.

Maybe a tiny fraction of the new cards in Mystical Archive will be played in Historic, but almost all of them will be fine inclusions for the Arena Cube.

👍 82
👀 u/atipongp
📅 Mar 26 2021

LaTeX formats archive

Hello everyone 👨‍💻
Check out my archive of #LaTeX formats. There you can find several formats for reports, slides, conference papers and more. 📃📉📊📘
Link: https://github.com/dennishnf/latex-formats-archive

👍 26
👀 u/dennishnf
📅 May 20 2021

The Boglehead's Guide to Investing is now on Archive.org! Multi-format (Theory on long-term investing) archive.org/details/the-b…
👍 6
👀 u/CuriousScholar24
📅 Jul 22 2021

Yet Another Archive Format Smuggling Malware trustwave.com/en-us/resou…
👍 9
👀 u/digicat
📅 Jun 28 2021
