How The Internet Works
Internet Basics
Well, for starters, if you're reading this you have some idea of how to get on the Internet
and how to find certain sites. This image gives a good overview of how it works in
general...
The Internet is actually made up of more than just the WWW ( World Wide Web ), as the diagram
shows. News, FTP ( File Transfer Protocol ), and the WWW are just a few
sections of the Internet. So how does it all work anyway?
The Internet is basically a vast network of personal computers, servers, and all types of machines
communicating over a common protocol, the I.P., or Internet Protocol. Each computer, or more generally
each device connected to the Internet ( or any network for that matter ), needs an I.P. address unique
to that device, but we will get into that a bit later. Think of the Internet as a huge bookshelf
with millions of different books of all types. Each book is a different machine, containing different
information. So how do you even get on the Internet in the first place? Let's take a look at the
different ways to connect to the Internet.
How the Internet connection works
You need to connect to the Internet somehow to receive and transmit data. How you connect can depend
on where you live ( you may have limited choices ) or your budget. A majority of home-based Internet
users connect via a dial-up connection, using a modem that's inside ( or in some cases outside ) the
computer. A modem is a device that converts the digital signals inside your machine into analog signals
that can travel over the PTN ( public telephone network ), and converts incoming analog signals back
into digital ones. Hence what modem actually stands for ( MOdulator and DEModulator ). Dial-up modem
users usually pay about $15 to $20 per month to an Internet Service Provider ( ISP ) that provides
them the means of connecting, viewing web-based media, and sending and receiving e-mail. xDSL
( Digital Subscriber Line ) also uses the existing copper wire of your phone lines to connect you,
but provides much greater bandwidth using a digital line that only works within a limited distance
of the phone company's equipment.
As you see from this diagram, the user connects to the local phone switch, then over the PTN to the
local carrier or a CLEC ( Competitive Local Exchange Carrier ), then to the ISP's equipment, where the
user is authenticated and then logged in to the network. Of course, if you're using cable or wireless
access to get on the Internet, your diagram would look different from the one above. If you were using
wireless, your bandwidth would come from a local antenna or satellite dish connecting you to your
Internet Service Provider. If you were using cable, you would connect via your local cable company
( and then probably an ISP they outsource to ).
Viewing web pages
It would probably be silly for me to explain how to view web pages here, since you are obviously
looking at this page in one form or another. Viewing web pages is handled by HTTP requests; HTTP
stands for HyperText Transfer Protocol. Basically your web browser, such as Internet Explorer or
Netscape Navigator, sends a request for the document you select to the server that the document
resides on. The server looks up the page and sends the document to you, text, multimedia content and
all. Your browser interprets the format and displays it on your screen. Depending on the document,
you may view, save, or execute the file or its multimedia content.
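To make the request and reply a little more concrete, here is a minimal sketch of what the browser sends and what it gets back. It only builds and parses text and does not touch the network. The host name, path, and sample reply are placeholders made up for illustration, and a real browser sends more headers than this.

```javascript
// Build the text of a minimal HTTP/1.1 GET request.
// The host and path passed in are placeholders chosen for this sketch.
function buildGetRequest(host, path) {
  return [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    "Connection: close",
    "",
    "",
  ].join("\r\n");
}

// Split a raw HTTP response into status code, headers, and body.
function parseResponse(raw) {
  const splitAt = raw.indexOf("\r\n\r\n");
  const head = splitAt === -1 ? raw : raw.slice(0, splitAt);
  const body = splitAt === -1 ? "" : raw.slice(splitAt + 4);
  const [statusLine, ...headerLines] = head.split("\r\n");
  const status = Number(statusLine.split(" ")[1]);
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon > 0) {
      const name = line.slice(0, colon).trim().toLowerCase();
      headers[name] = line.slice(colon + 1).trim();
    }
  }
  return { status, headers, body };
}

// A made-up reply, standing in for what a server might send back.
const sampleReply =
  "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>Hello</html>";

console.log(buildGetRequest("www.example.com", "/index.html"));
console.log(parseResponse(sampleReply));
```

The blank line after the headers is what tells the other side that the headers are finished and the document itself ( if any ) follows.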
How Electronic Mail works ( E-mail )
Checking for e-mail is similar to viewing web pages, although the process uses different software and
protocols. Using your mail software ( such as Outlook, Outlook Express, Netscape Messenger, Eudora, Pegasus,
Pine, etc. ) your computer checks for mail using the Post Office Protocol ( POP ) and sends mail using
the Simple Mail Transfer Protocol ( SMTP ). You specify the servers in the software configuration,
including the outgoing and incoming mail servers, along with your authentication information such as your
username and password. E-mail that is addressed to you is processed by your Internet Service Provider
( ISP ) and stored in a queue, where it waits for you to retrieve it. The mail server authenticates
you and then allows you to receive the mail waiting in your queue. Outgoing ( SMTP ) mail servers do not
usually require authentication, but most of them use an I.P. lookup table. That is, if you're
not a user of that particular system, and/or not logged onto the network related to that server,
the server will not allow you to send messages. Outgoing mail servers ( mail servers that send e-mail OUT
of your network ) are called open relays if they allow outbound e-mail traffic from users of other networks.
Unfortunately there are thousands of open relays on the Internet, and those who send junk e-mail ( SPAM )
use these servers to their advantage.
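The lookup-table idea can be sketched in a few lines. This is only an illustration of the check an outgoing mail server might make, not how any particular mail server implements it. The network ranges in the table are hypothetical and come from address blocks reserved for documentation.

```javascript
// Convert a dotted IPv4 address into a plain number.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True when `ip` falls inside the network written as "base/prefixLength".
function inNetwork(ip, cidr) {
  const [base, bits] = cidr.split("/");
  const size = 2 ** (32 - Number(bits)); // number of addresses in the block
  const start = Math.floor(ipToInt(base) / size) * size;
  const value = ipToInt(ip);
  return value >= start && value < start + size;
}

// Hypothetical lookup table: networks whose users may send through this server.
const allowedNetworks = ["192.0.2.0/24", "198.51.100.0/25"];

// Decide whether a connecting machine is allowed to send mail out.
function mayRelay(clientIp, table = allowedNetworks) {
  return table.some((network) => inNetwork(clientIp, network));
}

console.log(mayRelay("192.0.2.77"));  // a machine on one of the listed networks
console.log(mayRelay("203.0.113.9")); // a machine from somewhere else
```

An open relay behaves as if this check always answered yes, which is why spammers go looking for them.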
Internet Protocol and the Domain Name System
So how does the Internet know who 'www.icehousedesigns.com' is, for example? How does it know which
machine out of millions that name is assigned to?
Enter the magic of I.P. addressing, or Internet Protocol. Each device on the Internet is assigned an
I.P. address, written as four numbers from 0 to 255 separated by dots, such as '192.0.2.44'. Certain
machines, such as web servers, are also assigned 'domain names'.
For example, Yahoo! can be reached by typing http://www.yahoo.com in your browser window, or also by
typing 64.58.76.227, which is the I.P. address of Yahoo's website. So who knows what I.P. is assigned to
where? Domain names are registered through registrars, of which Network Solutions was the original one
for .com, .net, and .org names, and registrations are recorded in the whois database. No two names can
be alike. There are also name servers around the world, including those run by thousands of ISPs, that
keep copies of name-to-address records, which they obtain from the name servers responsible for each
domain and refresh periodically. After all, it would be way too slow if we had only one database with
millions of users polling it for information, and it is a lot easier to remember a simple name than a
number, right? Name servers are also called DNS servers ( DNS stands for Domain Name System ). So when
the average user types in a web server to visit, the DNS server looks up the name for a match in its
records. Once it finds a match, it resolves the name to an I.P. address, and your browser connects to
the appropriate machine.
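The lookup a name server performs can be pictured as a table lookup with a local copy kept for a while. Here is a minimal sketch of that idea, assuming a made-up record table and a made-up 60-second lifetime for local copies. Real DNS involves several layers of servers, so treat this as a toy model only.

```javascript
// Hypothetical records; the addresses come from a range reserved for
// documentation, not from any real site.
const masterRecords = new Map([
  ["www.example.com", "192.0.2.10"],
  ["mail.example.com", "192.0.2.25"],
]);

class CachingResolver {
  constructor(upstream, ttlMs) {
    this.upstream = upstream; // where answers come from on a cache miss
    this.ttlMs = ttlMs;       // how long a local copy stays valid
    this.cache = new Map();
    this.upstreamQueries = 0; // counts how often the upstream table is asked
  }

  // Return the address for a name, or null if the name is unknown.
  resolve(name, now = Date.now()) {
    const key = name.toLowerCase(); // domain names are case-insensitive
    const hit = this.cache.get(key);
    if (hit && hit.expires > now) return hit.address;

    this.upstreamQueries += 1;
    const address = this.upstream.get(key);
    if (address === undefined) return null;
    this.cache.set(key, { address, expires: now + this.ttlMs });
    return address;
  }
}

const resolver = new CachingResolver(masterRecords, 60000);
console.log(resolver.resolve("WWW.Example.com", 0));    // asks upstream
console.log(resolver.resolve("www.example.com", 1000)); // served from the local copy
console.log(resolver.upstreamQueries);
```

Keeping a local copy is what spares the central records from being polled by millions of users at once.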
Online shopping and security
One of the buzz words of the Internet has been 'e-commerce', or Electronic Commerce. E-commerce basically
means purchasing goods or services over the Internet, but just how secure is giving out your personal
information over the net anyway? Generally speaking, with a well-established company such as eBay, which
has developed relationships with millions of users, security is very tight. All personal information,
credit card numbers, etc., are sent to a secure server and encrypted, usually with 128-bit encryption,
which makes the data virtually impossible to read even if it is intercepted. However, as with anything,
it is not 100%. Just be wary about giving your personal information out online, and make sure that if
you do, it is through a secure connection.
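To get a feel for why a 128-bit key is so hard to crack by guessing, here is a rough back-of-the-envelope sketch. The guessing rate of one trillion keys per second is purely an assumption for illustration, and the sketch only covers trying every key, not other kinds of attack.

```javascript
// Number of possible keys for a given key length in bits.
function keyCount(bits) {
  return 2n ** BigInt(bits);
}

// Whole years needed to try every key at an assumed guessing rate.
function yearsToTryAll(bits, guessesPerSecond) {
  const secondsPerYear = 31536000n; // 365 days
  return keyCount(bits) / (BigInt(guessesPerSecond) * secondsPerYear);
}

// Assumption for illustration only: one trillion guesses per second.
const assumedRate = 1000000000000;

console.log(keyCount(128).toString());
console.log(yearsToTryAll(128, assumedRate).toString());
```

Each extra bit doubles the number of possible keys, so the total grows very quickly with key length.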
All about Junk E-mail ( SPAM )
I've been doing some research on a hot topic that's been around since the early days of the web. It's
called mail harvesting, and it's the number 1 way an e-mail address gets collected and used for unsolicited
e-mail. This will probably help answer a common question: "Why do I get so much spam?"
The answer all comes down to mail harvesting programs. Now I could rant on for hours about this, but I'm
going to try to sum it all up as much as possible.
Today on the Internet there are many marketing companies that send out messages to addresses on mailing lists.
Some companies use opt-in lists ( meaning you have to add your address to the list, you SUBSCRIBE ), but many
more use generated lists; this is where the unsolicited part comes in. There are literally hundreds of 'mail
harvesting' programs out there that can do 1 or more of the following things:
A. Follow links for keyword searches through search engines to sites, harvesting e-mail addresses from pages.
For example, if a marketing company wanted to build an e-mail database of web designers, all they would have
to do is enter "web design services" into the harvesting program, and it will follow the results returned
by the search engines to the specific sites and harvest the addresses from there.
B. Visit a single domain directly, pulling e-mail addresses from its pages.
This includes USENET groups, discussion forums ( which are commonly hit ), and ISPs.
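To show how little work this takes, here is a minimal sketch of the two things such a program does with a page: pick out anything shaped like an e-mail address, and collect the links it would visit next. The pattern and the sample page are made up for illustration; actual harvesting programs may work differently.

```javascript
// A deliberately simple pattern for text that looks like an e-mail address.
const ADDRESS_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return the unique address-like strings found in a page's source.
function findAddresses(pageSource) {
  const matches = pageSource.match(ADDRESS_PATTERN) || [];
  return [...new Set(matches.map((m) => m.toLowerCase()))];
}

// Return the web links on the page, which a crawler would follow next.
function findLinks(pageSource) {
  const links = [];
  const linkPattern = /href="(https?:\/\/[^"]+)"/gi;
  let match;
  while ((match = linkPattern.exec(pageSource)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Made-up page content for the sketch.
const samplePage = `
  <p>Contact: <a href="mailto:Webmaster@example.com">webmaster@example.com</a></p>
  <p>Sales: sales@example.com</p>
  <a href="http://www.example.com/about.html">About</a>
`;

console.log(findAddresses(samplePage));
console.log(findLinks(samplePage));
```

Any address that appears as plain text anywhere in a page's source is visible to this kind of scan, which is the reason for the precautions that follow.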
There are several ways you can protect your e-mail address from harvesting programs:
1. Don't post your e-mail address in any USENET groups. As far as message forums go, don't put your
e-mail address in your message body. If the forum has a profile for its users, you shouldn't enter your
e-mail address in there either, as it is still subject to harvesting. The most advanced harvesters
look for preprogrammed patterns in the most popular message boards such as the UBB.
2. A good harvester will only follow CGI links if it finds a certain pattern. Most amateur spammers
still use programs that limit how many directories deep they go into a site. From my results, using a
deeply nested address such as http://domain.com/a/b/c/d/e/f/email.html ( for example ) will cut down on
a lot of spam. If you have a cgi-bin, use an e-mail form that stores the e-mail address in the CGI code
( or PHP ) rather than in the page. Also, just to be safe, change the name of the CGI mail script to
something other than the word 'mail'.
3. If you are entering your e-mail address on a web page, use this sample JavaScript to help stop
harvesting programs:
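<script language="JavaScript">
// Keep the two halves of the address in separate variables so the
// full address never appears as one string in the page source.
var firsthalf = "username";
var secondhalf = "@domain.com";
// Write the finished mailto link into the page when the browser runs this.
document.write("<a href=\"mailto:" + firsthalf + secondhalf + "\">Email Me</a>");
</script>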
Substitute the data for variable "firsthalf" with your username and the data for variable "secondhalf"
with "@" followed by your actual domain. Most web browsers are JavaScript enabled, so that shouldn't
be much of a problem, and it is basically safe to use as a simple solution. However, one limitation
is that some harvesting programs will attempt to put together any combination of phrases that even
looks like an e-mail address, so this will help but is not foolproof.
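Here is a small sketch of why splitting the address helps against a simple scan. It checks two versions of a page's source against a basic address-shaped pattern. The pattern is an assumption about how a simple harvester might search, not a description of any particular program.

```javascript
// The kind of simple pattern a basic harvester might scan for.
const looksLikeAddress = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

// Page source with the address written out in full.
const plainSource = '<a href="mailto:username@domain.com">Email Me</a>';

// Page source with the address split across two script variables.
const splitSource = String.raw`<script language="JavaScript">
var firsthalf = "username";
var secondhalf = "@domain.com";
document.write("<a href=\"mailto:" + firsthalf + secondhalf + "\">Email Me</a>");
</script>`;

console.log(looksLikeAddress.test(plainSource)); // the full address is in the source
console.log(looksLikeAddress.test(splitSource)); // no complete address to find

// Only after the two halves are joined does a full address exist.
console.log(looksLikeAddress.test("username" + "@domain.com"));
```

A program that actually runs the page's scripts, or that tries gluing nearby strings together, would still recover the address.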
Blocking the user agents of well-known harvesting programs from your domain remains an option,
but as I stated before there are far too many to block them all. Some sites also embed tons of fake
e-mail addresses into their web pages in hopes the harvester will pick them up, resulting in a lot of
bounced mail.
The pro spammers are now using DNS lookups when compiling and sorting their lists. In the past it was
easy to generate false domains and e-mail addresses for spammers to suck up. Now it's easy for them to sort
the lists: if the domain isn't a current one ( i.e. false ) it's eliminated, so most of the anti-spam
programs that rely on fake addresses are ineffective. I've heard there is a way around this, but it's
still being tested out.
I would recommend that those with web space put up some sort of script on their index page that will
track the user-agent of visitors. Here is a brief list of the user-agents of the more popular harvesting
programs:
EmailSiphon
EmailWolf
ExtractorPro
Mozilla.*NEWT
Crescent
CherryPicker
WebBandit
NICErsPRO
EmailCollector
Just because a harvester program's user-agent visited your page doesn't mean your address was
successfully harvested.
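As a rough idea of what such a tracking script could look like, here is a minimal sketch that checks a visitor's user-agent against the list above and keeps a simple log. The function names and the in-memory log are hypothetical. How you obtain the user-agent depends on your web server or scripting setup, so it is passed in as a plain string here.

```javascript
// User-agent patterns taken from the list above. Matching ignores case,
// which is a choice made for this sketch.
const harvesterPatterns = [
  /EmailSiphon/i, /EmailWolf/i, /ExtractorPro/i, /Mozilla.*NEWT/i, /Crescent/i,
  /CherryPicker/i, /WebBandit/i, /NICErsPRO/i, /EmailCollector/i,
];

// True when the user-agent string matches one of the listed programs.
function isKnownHarvester(userAgent) {
  return harvesterPatterns.some((pattern) => pattern.test(userAgent || ""));
}

// A simple in-memory log of visits; a real page would write this to a file.
const visitLog = [];

function recordVisit(userAgent, when = new Date()) {
  const entry = { userAgent, when, harvester: isKnownHarvester(userAgent) };
  visitLog.push(entry);
  return entry;
}

console.log(recordVisit("EmailSiphon").harvester);
console.log(recordVisit("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)").harvester);
```

Since a program can send any user-agent string it likes, a log like this only catches the harvesters that identify themselves.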