How is MS Word related to HTML and CSS?

  • #1
christian0710
409
9
Just wondering: when I write a book in MS Word and save it as a web page file (HTML), it seems that all the formatting I chose in MS Word (bullets, indents, headings) translates perfectly to HTML and CSS. So are HTML and CSS built into MS Word? Does MS Word work like a browser, such that when I choose Heading 1, it's actually putting <h1> tags around the text and displaying it like a browser?
 

Answers and Replies

  • #2
36,295
13,370
No, MS Word knows how to transform its different internal format to (approximate) HTML.
 
  • #3
Svein
Science Advisor
Insights Author
2,274
785
If you are using a modern version of Word (the one that saves files with a .docx extension), the file format is based on XML (see https://en.wikipedia.org/wiki/Office_Open_XML). This means the Microsoft format and HTML are both tag-based markup, although they use different element vocabularies. You can see for yourself: make a copy of a .docx file you want to inspect, change the extension to .zip, and extract the contents.
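
The same inspection can be done without renaming anything, since a .docx file is an ordinary ZIP archive. The following is only a rough sketch using Python's standard library; the function names are made up for illustration, and it assumes the main text lives in word/document.xml and that headings carry a style ID such as "Heading1" (typical for an English-language Word, but the ID may differ in other setups).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:pStyle>
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_parts(docx_file):
    """Return the names of all files stored inside the .docx archive."""
    with zipfile.ZipFile(docx_file) as archive:
        return archive.namelist()


def paragraph_styles(docx_file):
    """Return (style_id, text) pairs for each paragraph in word/document.xml.

    style_id is None when the paragraph has no explicit paragraph style.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    result = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # The paragraph style, if any, is stored as <w:pPr><w:pStyle w:val="..."/>
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        # The visible text is spread over one or more <w:t> elements
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        result.append((style_id, text))
    return result
```

Calling `list_parts("book.docx")` on a hypothetical file shows the same collection of parts you get from unzipping, and `paragraph_styles("book.docx")` shows that a heading is a normal paragraph that refers to a named style rather than an <h1> element.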
 
  • #4
christian0710
409
9
Thank you. Does this docx --> HTML engine that Word introduced have a name?
 
  • #5
christian0710
409
9

Wow, that's a lot of files. Thank you for the demonstration. So Word can convert from its own coding language to XML, and today to HTML.
So what is the inherent/original coding language that Word uses for formatting, e.g. Heading 1, body text, etc.?
 
  • #6
FactChecker
Science Advisor
Homework Helper
Gold Member
7,726
3,392
A word of warning:
Although MS Word will save files in an HTML format, it makes very complicated HTML. In general, something like Heading 1 will not translate into a simple <h1>. In fact, nothing in those files is simple. Those saved HTML files work fine when opened in Word, but may not work well when opened in a browser. Even some versions of Internet Explorer will not display them correctly. Just last month, I had to rewrite a saved MS Word HTML file. It worked fine in my browser, but after I distributed it, I found out that other people could not see the figures. I had to spend a couple of days editing the file by hand to simplify the HTML and make it usable.

PS. Word has 2 options for saving in HTML format. One saves everything in a single file and the other saves many parts in a subdirectory. My impression is that the single-file option is not as standard as the option with the subdirectory.
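
To give an idea of what that hand-simplification involves, here is a minimal sketch in Python. It assumes the Word-specific clutter consists of conditional comments, `<o:p>` tags, `Mso...` class names, and inline `style` attributes, which are common in Word's output but certainly not the whole story. The function name and the sample string are made up for illustration, and regular expressions are a crude tool for HTML, so treat this as a starting point rather than a reliable cleaner.

```python
import re


def simplify_word_html(html: str) -> str:
    """Strip some common Word-specific markup from an HTML string."""
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Drop Office-namespace tags such as <o:p> and </o:p>, keeping any text inside
    html = re.sub(r"</?o:[^>]*>", "", html)
    # Drop class attributes whose value starts with Mso (quoted or unquoted)
    html = re.sub(r"""\s+class\s*=\s*["']?Mso\w*["']?""", "", html)
    # Drop inline style attributes (single- or double-quoted)
    html = re.sub(r"""\s+style\s*=\s*(".*?"|'.*?')""", "", html, flags=re.DOTALL)
    return html


# Made-up sample imitating the flavor of Word's saved HTML
sample = (
    "<h1 style='mso-pagination:widow-orphan'>Chapter 1<o:p></o:p></h1>"
    "<!--[if gte mso 9]><xml>junk</xml><![endif]-->"
    "<p class=MsoNormal style='margin-left:36.0pt'>Body text<o:p></o:p></p>"
)
print(simplify_word_html(sample))
# prints: <h1>Chapter 1</h1><p>Body text</p>
```

Note that this throws away all inline styling, so indents and fonts would have to be restored with your own CSS afterwards.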
 
  • #7
christian0710
409
9
Do you know if Kindle converts Word files nicely? I did view a Word HTML file and I see your point. Maybe writing ebooks in HTML and CSS is the way to go; I just need to find out which subset of HTML and CSS is allowed on Kindle devices.
 
  • #8
FactChecker
Science Advisor
Homework Helper
Gold Member
7,726
3,392
I have no experience with Kindle. I seriously doubt that it would be able to display a saved Word HTML file. I recommend that you use something where you can control the generated HTML and keep it as simple as possible. I have always ended up using a programmer's editor like gvim to edit HTML directly, but I hope you can find something better.
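
As one illustration of "controlling the generated HTML", a small script can emit only the handful of tags you decide to allow. This is just a sketch: the function name, the block kinds ("heading1", "heading2", "body"), and the tag mapping are all invented for the example, and it makes no claim about what Kindle actually accepts.

```python
from html import escape

# Hypothetical mapping from a block kind to the single plain tag used for it
TAGS = {"heading1": "h1", "heading2": "h2", "body": "p"}


def render_simple_html(title, blocks):
    """Build a minimal HTML page from (kind, text) pairs, one plain tag per block."""
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        "</head>",
        "<body>",
    ]
    for kind, text in blocks:
        tag = TAGS.get(kind, "p")  # unknown kinds fall back to a paragraph
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


print(render_simple_html("My Book", [("heading1", "Chapter 1"), ("body", "Tom & Jerry")]))
```

Because every tag comes from one small table, it is easy to see exactly which HTML ends up in the file and to restrict it further later.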
 
  • #9
newjerseyrunner
1,535
637
Word is notorious for format conversion problems. You may not know this, but Microsoft actually once had two different office suites: Office and Works. The dumbest thing? They couldn't understand each other's formats!
 