

HTTP for HTML Authors, Part II

Using Content-Type to specify character encoding

Web developers, being the lazy opportunists that we are, rarely care about the Content-Type header and generally assume that the server knows how to set it correctly. However, there is one case when configuring your server to send out a different value for the Content-Type header is critical, and that is when you're using a character encoding other than ISO-8859-1 to transport your documents over the Net.

I introduced you to character sets and character encodings back in Tutorial 17, and you should have a good understanding of them by now if you've written Web pages in anything except Western European languages.

You see, in addition to the type and subtype, MIME media types can have a number of optional parameters as well. In the case of the text/html media type, there is only one parameter that you might use, and it's called charset.

As you may well have gathered by now, the people who have taken it upon themselves to design and implement the technologies we use on the Web have a penchant for cruelty that makes the Marquis de Sade resemble the Easter Bunny on Valium. Thus, the designers of HTML decided that the charset parameter designates not, as you might have imagined, the character set of the document, but instead its character encoding. (As you may recall, the character set of HTML 4.0 documents is always supposed to be UCS, even though some versions of certain browsers that shall remain unnamed have differing opinions.)

So, let's assume that I want to write a Web page that contains some text in Greek, and decide to encode it with the most commonly used encoding for this language, ISO-8859-7. When I send these documents to my readers through the magic of HTTP, I want them, or rather their browsers, to know this so that my alphas will be alphas and not left-facing accented squiggly flurbs or whatever else happens to be the imaginatively named equivalent of an alpha in ISO-8859-1. To achieve this, I need to set up my Web server to send a Content-Type header that looks like the following:

Content-Type: text/html;charset=iso-8859-7

Those of you with a keen eye for glaringly obvious facts will have gathered by now that you have to replace iso-8859-7 in the above line with the name of the encoding you're using, if it's not ISO-8859-7. You can get the canonical list of character sets (in this case used as encodings) from IANA.
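If you'd like to see how a program on the receiving end might pick such a header apart, here is a minimal sketch in Python. It leans on the standard library's email.message module, which understands MIME parameters; the function name parse_content_type is my own invention for illustration, and real browsers certainly do more than this.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type value into (media type, charset or None).

    parse_content_type is a hypothetical helper name, not part of any
    standard API; the parsing itself is done by email.message.Message.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased type/subtype pair;
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header carries no charset at all.
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html;charset=iso-8859-7")
print(media_type)
print(charset)
```

Note that a header with no charset parameter yields None for the encoding, which is exactly the situation where the receiver is left guessing.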

How you achieve this depends on your Web server; consult the documentation that came with it or ask your hosting provider. Setting the charset parameter is all-important when producing properly internationalized pages, as it is the only way to guarantee that a browser that understands your chosen encoding will display your document correctly.
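To make the stakes concrete, here is a small Python sketch of what happens when the receiver guesses wrong. It encodes three Greek letters as ISO-8859-7 and then decodes the same bytes twice: once with the right encoding and once as ISO-8859-1. The sample string is my own; nothing here talks to a real server or browser.

```python
# Three Greek letters: alpha, beta, gamma.
greek = "αβγ"

# What actually travels over the wire: one byte per letter in ISO-8859-7.
raw = greek.encode("iso-8859-7")

# A receiver that was told the encoding recovers the original text.
right = raw.decode("iso-8859-7")

# A receiver that assumes ISO-8859-1 reads the very same bytes as
# accented Latin letters instead.
wrong = raw.decode("iso-8859-1")

print(raw)
print(right)
print(wrong)
```

The bytes are identical in both cases; only the interpretation differs, which is why the charset parameter matters so much.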

As mentioned in Tutorial 17, there are other ways to hint at the character encoding, but none of them work as well or are as widely supported as sending the right Content-Type HTTP header before the document.

Most Web servers decide which media type to assign to a document based on its filename extension (the .html bit in a file called index.html is the extension). If this is the case with your server, you have two options. You can configure your server to send the modified media type, including the charset parameter, for all files ending in .html, which is the ideal solution if you want to use this encoding for all of your HTML files. Alternatively, if you want a mix of encodings, you could store your ISO-8859-7-encoded documents in files ending in, for instance, .html-el and configure your Web server to send the modified Content-Type header for those files only.
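The second option boils down to a lookup table keyed on the extension. The following Python sketch shows the idea only; it is not the configuration syntax of any particular Web server, and the table contents, the content_type_for name and the fallback media type are all assumptions made for illustration.

```python
import os.path

# Hypothetical mapping: plain .html files get the bare media type, while
# files saved with the .html-el extension are announced as ISO-8859-7.
CONTENT_TYPES = {
    ".html": "text/html",
    ".html-el": "text/html;charset=iso-8859-7",
}


def content_type_for(filename, default="application/octet-stream"):
    """Pick a Content-Type value from the filename extension."""
    # splitext splits at the last dot, so "index.html-el" yields ".html-el".
    _, extension = os.path.splitext(filename)
    return CONTENT_TYPES.get(extension.lower(), default)


print(content_type_for("index.html"))
print(content_type_for("index.html-el"))
```

For the first option you would simply change the value stored under ".html" to include the charset parameter and drop the extra entry.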

Once again, the details of this procedure depend on your particular setup; Macintoshes mostly ignore filename extensions and rely on type and creator codes stored with each file instead for identifying file types, while some Web servers can even look inside a file and figure out its media type by examining the content. Obviously, if your documents are generated on the fly by something like a CGI program or a Java Servlet, you'll have to write this program so that it sends out the correct Content-Type header.
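For the generated-on-the-fly case, here is a rough sketch of a CGI-style program in Python. It assumes the usual CGI convention of writing the headers, then a blank line, then the document body to standard output; the build_cgi_response helper and the sample page are hypothetical. The important point is that the bytes of the body are produced with the same encoding that the header announces.

```python
import sys


def build_cgi_response(html_text, encoding="iso-8859-7"):
    """Return the header block and the encoded body as one bytes object.

    build_cgi_response is a hypothetical helper for illustration.
    """
    # The header names the encoding; a blank line ends the header block.
    header = "Content-Type: text/html;charset={}\r\n\r\n".format(encoding)
    # Encode the body with the very encoding the header just promised.
    return header.encode("ascii") + html_text.encode(encoding)


if __name__ == "__main__":
    page = "<html><body><p>αβγ</p></body></html>"
    # Write raw bytes so Python does not re-encode the output itself.
    sys.stdout.buffer.write(build_cgi_response(page))
```

Keeping the header and the encoding step in one function makes it harder for the two to drift apart, which is the usual way this goes wrong.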

URL: http://www.webreference.com/html/tutorial29/2.html

Produced by Stephanos Piperoglou
Created: January 24, 2001
Revised: February 27, 2001
