HTML/CSS: How is MS Word related to HTML and CSS?

  • Thread starter: christian0710
  • Tags: CSS, HTML
AI Thread Summary
MS Word can save documents as HTML, translating its internal formatting into an approximate HTML structure. This transformation is facilitated by the XML-based .docx file format, which allows for a relatively straightforward conversion to HTML. However, the HTML generated by MS Word is often complex and may not render correctly in web browsers, leading to potential issues when sharing documents. Users have reported that while MS Word's HTML files may work within Word itself, they can cause display problems in other browsers. Additionally, MS Word offers two HTML saving options: one that consolidates everything into a single file and another that organizes content into a subdirectory, with the latter being more standard. For those considering eBook formatting, it is suggested to use simpler HTML and CSS directly, as compatibility with platforms like Kindle may be limited when using Word-generated HTML.
christian0710
Just wondering: when I write a book in MS Word and save it as a web page (HTML), all the formatting I chose in Word (bullets, indents, headings) seems to translate perfectly to HTML and CSS. So are HTML and CSS built into MS Word? Does Word work like a browser, so that when I choose Heading 1 it actually puts <h1> tags around the text and displays it the way a browser would?
 
No. MS Word has its own internal format and knows how to convert it to (approximate) HTML.
 
If you are using a modern version of Word (the one that saves files with a docx extension), the file format is based on XML (see https://en.wikipedia.org/wiki/Office_Open_XML). This means that there is a relatively small difference between the Microsoft format and HTML. You can see for yourself: Create a copy of a .docx file you want to inspect, change the extension to .zip and extract the contents.
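The same inspection can be scripted without renaming anything, since a .docx file is an ordinary ZIP archive. The following is only a minimal sketch using Python's standard zipfile module; the function names are made up for illustration, and it assumes the main text sits in word/document.xml, which is the usual location in files written by Word.

```python
import zipfile


def list_docx_parts(path):
    """Return the names of all files stored inside a .docx archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def read_main_document(path):
    """Return the raw XML of the main document part as text.

    Assumes the body is stored at word/document.xml (the usual
    location in Word-generated files).
    """
    with zipfile.ZipFile(path) as archive:
        return archive.read("word/document.xml").decode("utf-8")
```

Calling list_docx_parts("book.docx") (a hypothetical file name) returns the archive's file list, and read_main_document returns the XML that holds the actual text and its style references.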
 
mfb said:
No. MS Word has its own internal format and knows how to convert it to (approximate) HTML.
Thank you. Does this docx-to-HTML converter in Word have a name?
 
Svein said:
If you are using a modern version of Word (the one that saves files with a docx extension), the file format is based on XML (see https://en.wikipedia.org/wiki/Office_Open_XML). This means that there is a relatively small difference between the Microsoft format and HTML. You can see for yourself: Create a copy of a .docx file you want to inspect, change the extension to .zip and extract the contents.

Wow, that's a lot of files. Thank you for the demonstration. So Word can convert from its own internal format to XML, and now to HTML.
So what is the inherent/original format that Word uses for formatting, e.g. Heading 1, body text, etc.?
 
christian0710 said:
Just wondering: when I write a book in MS Word and save it as a web page (HTML), all the formatting I chose in Word (bullets, indents, headings) seems to translate perfectly to HTML and CSS. So are HTML and CSS built into MS Word? Does Word work like a browser, so that when I choose Heading 1 it actually puts <h1> tags around the text and displays it the way a browser would?
A word of warning:
Although MS Word will save files in an HTML format, it produces very complicated HTML. In general, something like Heading 1 will not translate into a simple <h1>. In fact, nothing in those files is simple. Those saved HTML files work fine when opened in Word, but may not work well when opened in a browser. Even some versions of Internet Explorer will not display them correctly. Just last month, I had to rewrite a saved MS Word HTML file. It worked fine in my browser, but after I distributed it, I found out that other people could not see the figures. I had to spend a couple of days editing the file by hand to simplify the HTML and make it usable.

PS. Word has two options for saving in HTML format. One saves everything in a single file and the other saves many parts in a subdirectory. My impression is that the single-file option is not as standard as the option with the subdirectory.
 
FactChecker said:
A word of warning:
Although MS Word will save files in an HTML format, it produces very complicated HTML. In general, something like Heading 1 will not translate into a simple <h1>. In fact, nothing in those files is simple. Those saved HTML files work fine when opened in Word, but may not work well when opened in a browser. Even some versions of Internet Explorer will not display them correctly. Just last month, I had to rewrite a saved MS Word HTML file. It worked fine in my browser, but after I distributed it, I found out that other people could not see the figures. I had to spend a couple of days editing the file by hand to simplify the HTML and make it usable.
Do you know if Kindle converts Word documents nicely? I looked at an HTML file saved by Word and I see your point. Maybe writing ebooks directly in HTML and CSS is the way to go; I just need to find out which subset of HTML and CSS is supported on Kindle devices.
 
christian0710 said:
Do you know if Kindle converts Word documents nicely? I looked at an HTML file saved by Word and I see your point. Maybe writing ebooks directly in HTML and CSS is the way to go; I just need to find out which subset of HTML and CSS is supported on Kindle devices.
I have no experience with Kindle. I seriously doubt that it would be able to display HTML saved by Word. I recommend that you use something where you can control the generated HTML and keep it as simple as possible. I have always ended up using a programmer's editor like gvim to edit HTML directly, but I hope you can find something better.
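One way to keep control over the generated HTML is to read the paragraphs straight out of the .docx and emit only the tags you want. What follows is a rough sketch, not a full converter: it assumes the English style IDs Heading1 to Heading3 (localized versions of Word may use different IDs), the function name and the STYLE_TO_TAG mapping are made up for illustration, and it ignores lists, tables, images, and bold/italic runs.

```python
import html
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace, in ElementTree's "{uri}" prefix form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Assumed mapping from Word paragraph style IDs to HTML tags.
STYLE_TO_TAG = {"Heading1": "h1", "Heading2": "h2", "Heading3": "h3"}


def docx_to_simple_html(path):
    """Convert the paragraphs of a .docx file to plain <h1>..<h3>/<p> lines."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    for para in root.iter(W + "p"):
        # The paragraph style, if any, is stored as <w:pPr><w:pStyle w:val="..."/>.
        style = para.find(W + "pPr/" + W + "pStyle")
        style_id = style.get(W + "val") if style is not None else None

        # The visible text is split across <w:t> elements; join them.
        text = "".join(t.text or "" for t in para.iter(W + "t"))
        if not text.strip():
            continue  # skip empty paragraphs

        tag = STYLE_TO_TAG.get(style_id, "p")
        lines.append(f"<{tag}>{html.escape(text)}</{tag}>")
    return "\n".join(lines)
```

The output is deliberately bare, so any styling would come from a separate, hand-written CSS file rather than from Word.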
 
Word is notorious for format conversion problems. You may not know this, but Microsoft actually once had two different office suites: Office and Works. The dumbest thing? They couldn't understand each other's formats!
 