Downloading the Internet

Zetta_x, The Insane Statistician (OP)
Member | Joined Mar 4, 2010 | Messages: 1,844 | Trophies: 0 | Age: 34 | XP: 574 | United States
Let's say, theoretically, we had a big enough hard drive to store everything on the internet.

Let's also say we had a fast enough internet connection that this could be done in a day. Would it be possible to download the entire internet and browse every webpage offline?
 

BloodyFlame, Well-Known Member
Member | Joined Aug 6, 2010 | Messages: 361 | Trophies: 0 | Location: California | XP: 205 | United States
Zetta_x said:
Let's say, theoretically, we had a big enough hard drive to store everything on the internet.

Let's also say we had a fast enough internet connection that this could be done in a day. Would it be possible to download the entire internet and browse every webpage offline?
Terabytes of pr0nz.
:)
 

mrSmiles, Dundunduuuun
Member | Joined Oct 27, 2002 | Messages: 1,322 | Trophies: 0 | Age: 35 | XP: 397 | Canada
Zetta_x said:
Let's say, theoretically, we had a big enough hard drive to store everything on the internet.

Let's also say we had a fast enough internet connection that this could be done in a day. Would it be possible to download the entire internet and browse every webpage offline?

If we theoretically had a big enough hard drive and a fast enough internet connection, then theoretically we could download and browse everything on the net.

The question answers itself.
 

Zetta_x, The Insane Statistician (OP)
Member | Joined Mar 4, 2010 | Messages: 1,844 | Trophies: 0 | Age: 34 | XP: 574 | United States
OK, how about this, since you guys clearly agree it's possible to do:

How hard would it be, and how efficiently could it run, to create a web browser or Firefox extension that checks certain site directories every x minutes and downloads any updates?

What I want to create is an extension that lets a user define a certain subdirectory of a website (or the website itself), uses an active internet connection, and continually downloads any new material, with three modes:

1 - Disabled: behaves like a normal web browser does today.
2 - Hybrid Mode: when your internet connection drops, it loads the last stored copy of the page.
3 - Offline Mode: it loads the last stored copy regardless of the internet connection.
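
Something like Hybrid Mode can be prototyped outside the browser first. Below is a minimal Python sketch, not a real extension: it assumes the target is a plain static page, uses only the standard library, and the cache folder name, example URL, and 5-minute interval are placeholder choices.

Code:
import hashlib, time, urllib.request
from pathlib import Path

CACHE = Path("page_cache")          # where stored copies live
CACHE.mkdir(exist_ok=True)

def cache_path(url):
    # One file per URL, named by a hash of the URL.
    return CACHE / (hashlib.sha256(url.encode()).hexdigest() + ".html")

def refresh(url):
    # Download the current version and overwrite the stored copy.
    data = urllib.request.urlopen(url, timeout=10).read()
    cache_path(url).write_bytes(data)

def load(url):
    # Hybrid Mode: try the live page first, fall back to the last stored copy.
    try:
        refresh(url)
    except OSError:
        pass  # connection dropped; keep whatever copy we already have
    path = cache_path(url)
    return path.read_bytes() if path.exists() else None

if __name__ == "__main__":
    url = "https://example.com/"     # placeholder page to watch
    while True:
        page = load(url)
        print("have", 0 if page is None else len(page), "bytes for", url)
        time.sleep(300)              # re-check every 5 minutes

Offline Mode would just skip the refresh() call and read the cache directly, and Disabled would skip the cache entirely.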
 

monkat, I'd like to see you TRY to ban me. (Should I try?)
Banned | Joined May 21, 2009 | Messages: 2,242 | Trophies: 0 | Age: 32 | Location: Virginia | Website: www.monkat.net | XP: 105 | United States
Without access to every server's protected information?

Near infinitely difficult.

If everything were laid out just like your hard drive, then it wouldn't be too hard, assuming the computer was large and powerful enough to handle billions of internet requests in a given amount of time.

I don't see why you're asking, though; it's not going to happen. Ever.
 

Zetta_x, The Insane Statistician (OP)
Member | Joined Mar 4, 2010 | Messages: 1,844 | Trophies: 0 | Age: 34 | XP: 574 | United States
Read my last post; it's fairly evident.

monkat said:
I don't see why you're asking, though; it's not going to happen. Ever.

Just like GameCube USB loading?

This kind of attitude is exactly what holds things back. In a world with millions of people, in order to achieve something no one else has, you have to be different and think outside the box.

I wouldn't see your attitude as a negative thing, though, because it's comments like yours that give people the motivation to rub it in your face once it is done.

for separation of post

How many people have used a spoiler tag to separate an edit? Clearly, the answer so far is yes, it is possible, just not probable, to download the entire internet and run it offline. But why would I ask, since obviously it's not going to happen... oh yeah, and to add extra emphasis... ever.

The reason I asked is that I have a laptop I constantly use for travel. Imagine downloading even a small portion of a wiki and using it as a reference when I'm somewhere I can't get online. What if I developed a search engine that goes through a site and downloads any pages related to microbiology? When those pages are found, I could use an active internet connection to fetch the information and then use it at school in an offline environment.

Or at home, where my internet connection drops often. When it drops, I often have to wait 5-10 minutes for it to come back, sometimes even restarting the router. What if I made an extension that let me keep viewing a copy of the webpage that is at most 5 minutes old? I could continue to do research during this downtime...

If downloading the whole internet is possible, then it is also possible to download any subset of the internet, maybe one page or four, as long as you have the resources. And if everyone who has replied is right, then it should be possible.
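
That "grab every page on a site about one topic" idea can be sketched as a small crawler. A rough Python example using only the standard library; the start URL, keyword, and page limit are placeholders, and a real version would also need robots.txt handling and politeness delays.

Code:
import re, urllib.parse, urllib.request
from html.parser import HTMLParser
from pathlib import Path

class LinkParser(HTMLParser):
    # Collects href values from <a> tags.
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, keyword, limit=50, out_dir="offline_pages"):
    Path(out_dir).mkdir(exist_ok=True)
    site = urllib.parse.urlparse(start_url).netloc
    seen, queue, saved = set(), [start_url], 0
    while queue and saved < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue
        # Keep the page only if it mentions the topic we care about.
        if re.search(keyword, html, re.IGNORECASE):
            (Path(out_dir) / f"page_{saved:04d}.html").write_text(html, encoding="utf-8")
            saved += 1
        # Queue outgoing links, but stay on the same site.
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            absolute = urllib.parse.urljoin(url, link)
            if urllib.parse.urlparse(absolute).netloc == site:
                queue.append(absolute)

# crawl("https://example.org/wiki/", "microbiology")   # placeholder site and topic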
 

redact
Member | Joined Dec 2, 2007 | Messages: 3,161 | Trophies: 0 | Location: - | XP: 674 | Mauritania
I don't see how this would be possible unless the entire internet consisted of static pages.

Dynamic pages (such as the one you are currently viewing) make your dream of owning the internet impossible, not just implausible.
 

Rydian, Resident Furvert™
Member | Joined Feb 4, 2010 | Messages: 27,880 | Trophies: 0 | Age: 36 | Location: Cave Entrance, Watching Cyan Write Letters | Website: rydian.net | XP: 9,111 | United States
http://en.wikipedia.org/wiki/Wget
This can download recursively, meaning it'll follow links and download those pages as well, and it rewrites the links to point at the local copies when you view them and shit.

So yes, you could use wget to download the entire internet, time and space permitting.

However, none of it would be interactive offline (outside of what AJAX/JS/Flash can do locally).

 

mysticwaterfall, Streamforce Supreme Commander
Member | Joined Aug 11, 2008 | Messages: 1,874 | Trophies: 0 | Location: Right behind you | XP: 668 | United States
The only way such a system would ever work is if you set it up on a site-by-site basis and only had it do a small number of websites. A number of programs actually did this back in the day: they would cache the links on a website so you could read them later, offline. The main point of it then was that there was no broadband and a lot of people were paying by the hour for internet. And they could only handle small numbers of pages at a time.

Of course, the difference is that back then websites were a lot simpler and there was little dynamic content. Even if it were practical to cache more than a few websites at a time now, they would be out of date almost instantly.

EDIT: Reading your edit, there certainly are offline dumps of Wikipedia and the like, but you would still only be able to do it for specific websites at a time, and even then you could only spider the links there. Having it amass everything on something like "Microbiology" constantly in the background while you work would be insanely impractical. There's a reason Google has massive server farms that do nothing but spider the web all day.
 

Zetta_x, The Insane Statistician (OP)
Member | Joined Mar 4, 2010 | Messages: 1,844 | Trophies: 0 | Age: 34 | XP: 574 | United States
Dynamic pages, and stuff like Java and Flash, would be a limitation, I know; maybe there are a few workarounds, but the majority of this stuff, I agree, couldn't be obtained and run offline. Still, some Flash programs only serve as a medium to load content off a server. Take Flash Flash Revolution as an example: when the site went down last year, it was possible to download all 2 gigs of the engine, songs, and charts and play it offline.

However, there are some things, like GameFAQs pages, wiki pages, and various other applications, where I can see it benefiting me, especially in the places where my connection is flaky, so dropped internet connections wouldn't be a burden anymore.

---

It would be possible to create an AHK script to copy entire dynamic webpages and store them in a file. But how would the AHK script know what to look for? This is where ideas could come into play.
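
One option, swapping browser automation in for AHK purely as an illustration, is to drive the browser itself and save the rendered DOM after the page's scripts have run. A rough Python sketch, assuming the third-party Selenium package and a local geckodriver install; the URL and output filename are placeholders.

Code:
from pathlib import Path
from selenium import webdriver   # assumption: selenium is installed, plus geckodriver for Firefox

def save_rendered_page(url, out_file="snapshot.html"):
    # Drive a real browser so page scripts actually run, then store the rendered markup.
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        Path(out_file).write_text(driver.page_source, encoding="utf-8")
    finally:
        driver.quit()

# save_rendered_page("https://example.com/some-dynamic-page")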

For example, I created an AutoHotkey script, along with a Thunderbird setup, so I can send a text message to a certain email address in a very specific format. The AutoHotkey script then runs some macros to enter the address into Google Maps and sends the directions back to my cell.

mysticwaterfall said:
EDIT: Reading your edit, there certainly are offline dumps of Wikipedia and the like, but you would still only be able to do it for specific websites at a time, and even then you could only spider the links there. Having it amass everything on something like "Microbiology" constantly in the background while you work would be insanely impractical. There's a reason Google has massive server farms that do nothing but spider the web all day.

Impractical today in some cases, sure, but it still has practical applications. Microbiology was mainly used as an example rather than as a specific requirement.

It would work like a giant RSS feed: you list which websites to feed off of, and it checks them and compares them against the stored copies.
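
The comparing step can be as simple as hashing each page and checking whether the hash changed since the last pass. A minimal sketch using only the Python standard library; the state-file name and example URLs are placeholders.

Code:
import hashlib, json, urllib.request
from pathlib import Path

STATE = Path("feed_state.json")   # remembers the last-seen hash per URL

def check_for_updates(urls):
    state = json.loads(STATE.read_text()) if STATE.exists() else {}
    changed = []
    for url in urls:
        try:
            body = urllib.request.urlopen(url, timeout=10).read()
        except OSError:
            continue                     # skip unreachable sites this round
        digest = hashlib.sha256(body).hexdigest()
        if state.get(url) != digest:     # new page or the content changed
            changed.append(url)
            state[url] = digest
    STATE.write_text(json.dumps(state))
    return changed

# print(check_for_updates(["https://example.com/", "https://example.org/news"]))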


Thanks, Rydian, for that link; that is pretty much what I was looking for. Something similar to that, but combined with a nice GUI, active transfer details, and an integration with a browser.
 

Rydian, Resident Furvert™
Member | Joined Feb 4, 2010 | Messages: 27,880 | Trophies: 0 | Age: 36 | Location: Cave Entrance, Watching Cyan Write Letters | Website: rydian.net | XP: 9,111 | United States
DownThemAll is an addon for Firefox that can be used to save linked/embedded content on a page and crap, but it's not nearly as good as wget.

There's no tool to do what you want from within a browser, because it doesn't make any sense and nobody's ever had to do it for a job or whatever.
:P


Check out wget's arguments and you might be surprised by what can be done with it. You can also write scripts that call it.
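
For instance, a small script could handle the "re-cache this subdirectory every x minutes" idea by calling wget on a schedule. A rough Python sketch, assuming wget is installed and on the PATH; the URL, depth, and interval are placeholder choices.

Code:
import subprocess, time

def mirror(url, dest="site_mirror", interval_minutes=60):
    # Re-run wget on a schedule; --timestamping skips files that haven't changed.
    cmd = [
        "wget",
        "--recursive", "--level=3",   # follow links up to 3 hops deep
        "--no-parent",                # stay inside the given subdirectory
        "--convert-links",            # rewrite links to point at the local copies
        "--page-requisites",          # also grab the images/CSS/JS each page needs
        "--adjust-extension",         # save pages with .html extensions
        "--timestamping",             # only re-download files that changed
        "--wait=1",                   # be polite: pause a second between requests
        "--directory-prefix", dest,
        url,
    ]
    while True:
        subprocess.run(cmd, check=False)
        time.sleep(interval_minutes * 60)

# mirror("https://example.com/docs/", interval_minutes=60)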
 

mameks, in memoriam of gravitas
Member | Joined Jun 18, 2009 | Messages: 2,300 | Trophies: 0 | Age: 28 | Location: Charlotte's maze | XP: 545 | United Kingdom
This hard drive would be f'king massive. As in huge building/small city massive... we had a presentation given to us at school about it.
 

playallday, Group: GBAtemp Ghost
Member | Joined May 23, 2008 | Messages: 3,767 | Trophies: 1 | Location: [@N@[)@ | XP: 494 | Canada
pikachu945 said:
the internet itself could be more than 700 yottabytes, I think it was 500 in 2009
Wiki: As of 2010, no storage system has achieved one zettabyte of information. The combined space of all computer hard drives in the world does not amount to even one yottabyte, but was estimated at approximately 160 exabytes in 2006. As of 2009, the entire Internet was estimated to contain close to 500 exabytes.
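
For a sense of scale against the "done in a day" premise, here is a quick back-of-envelope check in Python using that 500-exabyte figure and an assumed (purely illustrative) 1 Gbit/s connection.

Code:
# Rough time to pull 500 exabytes over an assumed 1 Gbit/s link.
internet_bytes = 500 * 10**18            # ~500 exabytes (the 2009 estimate quoted above)
link_bits_per_second = 1 * 10**9         # assumed: a 1 gigabit per second connection
seconds = internet_bytes * 8 / link_bits_per_second
years = seconds / (60 * 60 * 24 * 365)
print(f"about {years:,.0f} years")       # roughly 127,000 years

Even at that speed it works out to over a hundred thousand years of continuous downloading.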
 

pikachu945, Well-Known Member
Member | Joined Sep 13, 2009 | Messages: 691 | Trophies: 0 | XP: 427 | Canada
Arctic said:
pikachu945 said:
the internet itself could be more than 700 yottabytes, I think it was 500 in 2009
Wiki: As of 2010, no storage system has achieved one zettabyte of information. The combined space of all computer hard drives in the world does not amount to even one yottabyte, but was estimated at approximately 160 exabytes in 2006. As of 2009, the entire Internet was estimated to contain close to 500 exabytes.

lol it was a joke and I got told
:(
 
