How Is MS Word related to HTML and CSS?

  • Context: HTML/CSS
  • Thread starter: christian0710
  • Tags: CSS, HTML

Discussion Overview

The discussion revolves around the relationship between MS Word and HTML/CSS, particularly focusing on how formatting in Word translates to web formats when saving documents as HTML. Participants explore the technical aspects of this conversion, the complexities involved, and the implications for web compatibility.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Experimental/applied

Main Points Raised

  • Some participants wonder if MS Word incorporates HTML and CSS directly, suggesting that formatting choices like Heading 1 might correspond to actual HTML tags.
  • Others clarify that MS Word transforms its internal format to an approximate HTML, rather than embedding HTML/CSS directly.
  • It is noted that the modern .docx format is based on XML, which shares similarities with HTML, leading to a relatively straightforward conversion process.
  • Concerns are raised about the complexity of the HTML generated by MS Word, with some participants stating that it does not translate into simple HTML elements like <h1>, and may not display correctly in web browsers.
  • One participant shares an experience of having to manually edit a saved HTML file from Word to make it usable for distribution, highlighting potential issues with compatibility across different platforms.
  • Questions arise regarding the compatibility of Word-generated HTML with Kindle devices, with some skepticism expressed about the ability of Kindle to properly display such files.
  • Another participant mentions the historical context of format conversion issues within Microsoft Office products, indicating a broader concern about compatibility across different formats.

Areas of Agreement / Disagreement

Participants express a mix of agreement and disagreement regarding the capabilities and limitations of MS Word in converting documents to HTML. While some acknowledge the transformation process, others emphasize the complications and potential issues that arise from the generated HTML.

Contextual Notes

Limitations include the complexity of the HTML produced by MS Word, which may not conform to standard web practices, and the potential for compatibility issues when viewed in different browsers or devices.

Who May Find This Useful

This discussion may be useful for users of MS Word who are interested in web publishing, developers looking to understand document conversion processes, and those exploring eBook formatting for devices like Kindle.

christian0710
Just wondering: when I write a book in MS Word and save it as a webpage file (HTML), it seems that all the formatting I chose in MS Word (bullets, indents, headings) translates perfectly to HTML and CSS. So are HTML and CSS built into MS Word? Does MS Word work like a browser, such that when I choose Heading 1, it's actually putting <h1> tags around the text and displaying it like a browser?
 
No, MS Word knows how to transform its different internal format to (approximate) HTML.
 
If you are using a modern version of Word (the one that saves files with a docx extension), the file format is based on XML (see https://en.wikipedia.org/wiki/Office_Open_XML). This means that there is a relatively small difference between the Microsoft format and HTML. You can see for yourself: Create a copy of a .docx file you want to inspect, change the extension to .zip and extract the contents.
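The same inspection can be done without renaming the file, since a .docx is an ordinary zip archive. The following is a minimal sketch, not anything Word itself provides: the function names `list_docx_parts` and `paragraph_styles` are made up for illustration, and it assumes the main text lives in `word/document.xml`, as it does in files written by Word. It also shows that a heading is stored as a reference to a named paragraph style rather than as an `<h1>` tag.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_docx_parts(path):
    """Return the names of the files stored inside a .docx archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def paragraph_styles(path):
    """Return the style id of each paragraph (None when no style is set)."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    styles = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A styled paragraph carries <w:pPr><w:pStyle w:val="..."/></w:pPr>.
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        if style is None:
            styles.append(None)
        else:
            styles.append(style.get(f"{{{W_NS}}}val"))
    return styles
```

Calling `paragraph_styles("book.docx")` on a hypothetical file would typically return ids such as `Heading1` for heading paragraphs in an English-language Word install, though the exact ids depend on the document's style definitions.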
 
mfb said:
No, MS Word knows how to transform its different internal format to (approximate) HTML.
Thank you. Does this docx --> HTML engine that Word introduced have a name?
 
Svein said:
If you are using a modern version of Word (the one that saves files with a docx extension), the file format is based on XML (see https://en.wikipedia.org/wiki/Office_Open_XML). This means that there is a relatively small difference between the Microsoft format and HTML. You can see for yourself: Create a copy of a .docx file you want to inspect, change the extension to .zip and extract the contents.

Wow, that's a lot of files. Thank you for the demonstration. So Word can convert from Word's coding language to XML, and today to HTML.
So what is the inherent/original coding language that Word uses for formatting, e.g. Heading 1, body text, etc.?
 
christian0710 said:
Just wondering: when I write a book in MS Word and save it as a webpage file (HTML), it seems that all the formatting I chose in MS Word (bullets, indents, headings) translates perfectly to HTML and CSS. So are HTML and CSS built into MS Word? Does MS Word work like a browser, such that when I choose Heading 1, it's actually putting <h1> tags around the text and displaying it like a browser?
A word of warning:
Although MS Word will save files in an HTML format, it makes very complicated HTML. In general, something like Heading 1 will not translate into a simple <h1>. In fact, nothing in those files is simple. Those saved HTML files work fine when opened in Word, but may not work well when opened in a browser. Even some versions of Internet Explorer will not display them correctly. Just last month, I had to rewrite a saved MS Word HTML file. It worked fine in my browser, but after I distributed it, I found out that other people could not see the figures. I had to spend a couple of days editing the file by hand to simplify the HTML and make it usable.

PS. Word has 2 options for saving in HTML format. One saves everything in a single file and the other saves many parts in a subdirectory. My impression is that the single-file option is not as standard as the option with the subdirectory.
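
Part of the hand simplification described above can be scripted. What follows is only a rough sketch using Python's standard `html.parser`, not the procedure actually used on that file: the class name `WordHtmlSimplifier`, the helper `simplify`, and the choice of which tags and attributes to keep are all assumptions made for illustration. It drops every attribute except a small allow-list, removes `<span>` and `<font>` wrappers, and discards namespaced Office tags such as `<o:p>`.

```python
from html import escape
from html.parser import HTMLParser

KEEP_ATTRS = {"href", "src", "alt"}   # assumed allow-list of attributes
DROP_TAGS = {"span", "font"}          # wrappers removed, their text is kept
VOID_TAGS = {"br", "img", "hr"}       # tags that never get a closing tag


class WordHtmlSimplifier(HTMLParser):
    """Re-emit HTML with Word-specific styling and tags stripped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _skip(self, tag):
        # Namespaced tags (o:p, v:shape, ...) are Office extensions.
        return ":" in tag or tag in DROP_TAGS

    def handle_starttag(self, tag, attrs):
        if self._skip(tag):
            return
        kept = ""
        for name, value in attrs:
            if name in KEEP_ATTRS:
                kept += ' {}="{}"'.format(name, escape(value or "", quote=True))
        self.out.append("<{}{}>".format(tag, kept))

    def handle_endtag(self, tag):
        if self._skip(tag) or tag in VOID_TAGS:
            return
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def simplify(html_text):
    """Return a simplified copy of html_text."""
    parser = WordHtmlSimplifier()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For a hand-written fragment in the style of Word's output, such as `<h1 style='mso-outline-level:1'><span lang=EN-US>Chapter 1<o:p></o:p></span></h1>`, `simplify` returns `<h1>Chapter 1</h1>`. Comments are dropped as well, because the parser's default comment handler does nothing. A real document would still need checking by eye, since this sketch throws away all styling, including any that was wanted.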
 
FactChecker said:
A word of warning:
Although MS Word will save files in an HTML format, it makes very complicated HTML. In general, something like Heading 1 will not translate into a simple <h1>. In fact, nothing in those files is simple. Those saved HTML files work fine when opened in Word, but may not work well when opened in a browser. Even some versions of Internet Explorer will not display them correctly. Just last month, I had to rewrite a saved MS Word HTML file. It worked fine in my browser, but after I distributed it, I found out that other people could not see the figures. I had to spend a couple of days editing the file by hand to simplify the HTML and make it usable.
Do you know if Kindle converts Word files nicely? I did view a Word HTML file and I see your point. Maybe writing ebooks in HTML and CSS is the way to go; I just need to find out which subset of HTML and CSS is allowed on Kindle devices.
 
christian0710 said:
Do you know if Kindle converts Word files nicely? I did view a Word HTML file and I see your point. Maybe writing ebooks in HTML and CSS is the way to go; I just need to find out which subset of HTML and CSS is allowed on Kindle devices.
I have no experience with Kindle. I seriously doubt that it would be able to display a saved Word HTML file. I recommend that you use something where you can control the generated HTML and keep it as simple as possible. I have always ended up using a programmer's editor like gvim to edit HTML directly, but I hope you can find something better.
 
Word is notorious for format conversion problems. You may not know this, but Microsoft actually once had two different office suites: Office and Works. The dumbest thing? They couldn't understand each other's formats!
 