Digital Content Quality Assurance Service

Context-based Spell Check | Data Mining | SEO

Need to proof your large business or educational website for typographical errors?
Need to expend a minimal amount of time and effort?
Have a website with complex terminology not found in many dictionaries?

This service specializes in handling such content.

Media content such as newspapers, magazines, blogs, forums, and product and service documentation, as well as large books, can also be processed by this service. This service produces useful results where others produce hundreds of unrecognized words that should be in their dictionaries or be identified as correct through context-based spell checking.


Identify emerging trends and top VIPs in blogs and forums using the content topic phrase data mining feature provided at no additional charge.


Did you know that mistyped words on your website impact your Internet search engine ranking on sites such as Google, Bing and Yahoo Search? Many business websites spend hundreds of dollars on advanced SEO tweaks in an attempt to improve their search ranking, but fail to ensure their website content is free of typographical mistakes. A reduced search engine ranking impacts the number of visitors to your website and thus impacts your bottom line.

Typos impact your organization's reputation and credibility.  Many websites state that user trust is a major consideration, but at the same time are littered with typos.

Typos can cause a loss in business productivity due to the time spent providing clarification to clients and internally within your organization. You may think that your documents are error-free, but you would be surprised to learn how many issues can be identified on most large websites. To prove this point, go to the "Samples" web page on this website to see examples of content issues found by this service on the "U.S. Government Printing Office" (GPO) website and http://www.whitehouse.gov. Even the best managed and maintained websites typically have between twenty and thirty issues.


So why do so many websites have mistakes even after using a spell check tool during web page creation and editing? Usually the dictionaries used by these tools are not very robust. You usually find yourself adding large numbers of new, unverified words to your local dictionaries once you incorporate your specialized business and technical terminology. Many of these words may never be verified because checking each word is such a time-consuming process. Rushed deadlines and tired employees also lead to mistakes. In addition, your spell check tool may not have grammar check functionality.


This service also provides fundamental SEO quality assurance checks for your website, creating a one-stop service for all your fundamental SEO needs.

This means there is no need to use a separate spell check and SEO service.


The dictionaries used by this service are constantly being updated with the latest words and terminology. It simply isn't cost effective to try to duplicate the capabilities of this service. Its capabilities have been evolving for a number of years, and it is well trained from processing hundreds of websites.



Core Service Features

This service identifies:

  • Spelling errors in document bodies (example)
  • Spelling errors in HTML tags (title, desc., keywords) (example)
  • Word mis-capitalizations of proper nouns (example)
  • Double typed words such as "the the"  (example) (example)
  • Unpopulated HTML tags affecting your Internet search engine ranking
  • Misuse of the articles "a" and "an" (grammar check)
  • Documents containing profanity
  • Documents containing foreign languages (to group in own area)
  • U.K. English words, listed with their U.S. English counterparts
  • Duplicate document files (to free disk space) (example)
  • Zero size document files
  • Unreadable or corrupt document files (example)
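
Two of the checks listed above, double typed words and misuse of "a" and "an", can be illustrated with a short sketch. This is not the method used by this service, which is not described here; it is a minimal Python illustration, and the function names are hypothetical. The article check looks only at the first letter of the following word, so it would misjudge words such as "hour" or "university" that a real grammar check has to handle by sound.

```python
import re

# A word immediately followed by itself, e.g. "the the" (case-insensitive).
DOUBLED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

# Spelling-based article check (an assumption made for this sketch):
# "a" followed by a vowel letter, or "an" followed by a consonant letter.
A_BEFORE_VOWEL = re.compile(r"\ba\s+[aeiou]\w*", re.IGNORECASE)
AN_BEFORE_CONSONANT = re.compile(r"\ban\s+[b-df-hj-np-tv-z]\w*", re.IGNORECASE)


def find_doubled_words(text):
    """Return each repeated word (lowercased) in order of appearance."""
    return [match.group(1).lower() for match in DOUBLED_WORD.finditer(text)]


def find_article_misuse(text):
    """Return suspicious "a"/"an" phrases in order of appearance."""
    hits = []
    for pattern in (A_BEFORE_VOWEL, AN_BEFORE_CONSONANT):
        hits.extend((m.start(), m.group(0)) for m in pattern.finditer(text))
    return [phrase for _, phrase in sorted(hits)]


if __name__ == "__main__":
    print(find_doubled_words("This is the the best result."))
    print(find_article_misuse("She ate a apple and an banana."))
```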

Additional Features

  • Context-based spell check (e.g., "Storey County", NV)
  • Specialized business and science terminology is included
  • Full hyphenated word support
  • Top keyword phrase extraction for trend/content analysis (example)
  • HTML tag counts and population statistics (example)
  • HTML alternate image text summary report (example)
  • List of website addresses present in your documents (example)
  • List of e-mail addresses present in your documents (example)
  • List of unrecognized words you can use to add to a local custom dictionary
  • File selection filters to select specific documents by location and file name
  • MD5 file checksum listing for document modification monitoring
  • Robust and refined dictionaries to limit the number of unrecognized words
  • Bird's eye view of HTML tag content
  • No software to learn or install, simply request service via e-mail
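
The MD5 file checksum listing mentioned above can be used to monitor documents for modification: a listing taken today is compared against an earlier one, and any file whose checksum differs has changed. The sketch below shows the general idea in Python using only the standard library. The function names are hypothetical and the format of the listing this service actually delivers is not specified here, so treat this as an assumption-laden illustration. The same checksums can also be used to group duplicate files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_listing(root):
    """Map each file's path (relative to root) to its MD5 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old_listing, new_listing):
    """Return files that are new or whose checksum differs from the old listing."""
    return sorted(
        name
        for name, checksum in new_listing.items()
        if old_listing.get(name) != checksum
    )


def duplicate_groups(listing):
    """Return groups of file names that share the same checksum."""
    by_checksum = defaultdict(list)
    for name, checksum in listing.items():
        by_checksum[checksum].append(name)
    return sorted(sorted(names) for names in by_checksum.values() if len(names) > 1)
```

In use, a listing would be saved after each check and passed as old_listing the next time, so that only changed documents need to be reviewed again.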


This service currently supports the following document types:

 

  • HTML web pages
  • Microsoft Word Documents (.doc)
  • Active server pages (.asp/.aspx)
  • Java server pages (.jsp)
  • ASCII text files
  • Rich-text files (.rtf)
  • Adobe PDF files (.pdf)
  • HTML files containing an XML ID tag*
  • Cold Fusion Markup Language files (.cfm)
  • PHP web pages (.php)


* XML files can be processed when they solely contain English language words.


Service Limitations

This service is designed for documents which are available to the public.  This service does not have security safeguards needed to protect documents that contain sensitive information such as social security numbers, credit card numbers, military secrets, bank account numbers, etc.

Documents heavily populated with foreign language characters are not good candidates for this service. If your documents contain large segments of English mixed with foreign languages, then the "unrecognized items file" may be larger, making it more time-consuming to use that information to create a custom dictionary for your document set. This service can currently recognize a large number of Spanish, French and German words, which prevents those words from appearing in the unrecognized words list. Some Latin words, such as those used in legal terminology, may not be recognized.

Service cannot be provided for websites that are blocked due to adult content by content filters such as those employed by public libraries or public Wi-Fi access points.

Smaller websites and document sets create a smaller "unrecognized items file" making it easier to create a custom dictionary for future spell checks.

This service cannot process web pages that use copy protection schemes, nor document files that are compressed, encrypted or password-protected. This service may not be able to process websites using exotic methods to render web pages.

See this website's Legal page for additional information regarding terms of service.

This service uses your website's "robots.txt" file to determine which files to check.
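
As an illustration of how a "robots.txt" file can determine which files are checked, the sketch below filters a list of page addresses using the robots.txt parser in the Python standard library. How this service applies the file internally is not described here; the function name, the "ExampleChecker" user agent and the example.com addresses are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt, urls, user_agent="ExampleChecker"):
    """Return only the URLs that the given robots.txt text permits.

    robots_txt is the file's content as a string; user_agent is a
    hypothetical crawler name chosen for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    pages = [
        "http://example.com/index.html",
        "http://example.com/private/notes.html",
    ]
    # Pages under /private/ are excluded from the check.
    print(allowed_urls(robots, pages))
```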






Allan Kirsch is a member of the business network.