Lvg: 2005~2007 strip diacritics
Strip Diacritics:
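As discussed in the Unicode Normalization section, Normalization Form D (NFD) can be used to strip diacritics: NFD decomposes each accented character into a base character followed by one or more combining marks, and the combining marks are then removed. A sample of the diacritics stripped by this method is shown below.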
Numeric Entity | Unicode | Symbol | Description | Stripped Character |
192 | \u00c0 | À | Capital A, grave accent | A |
193 | \u00c1 | Á | Capital A, acute accent | A |
194 | \u00c2 | Â | Capital A, circumflex accent | A |
195 | \u00c3 | Ã | Capital A, tilde | A |
196 | \u00c4 | Ä | Capital A, umlaut | A |
197 | \u00c5 | Å | Capital A, ring | A |
199 | \u00c7 | Ç | Capital C, cedilla | C |
200 | \u00c8 | È | Capital E, grave accent | E |
201 | \u00c9 | É | Capital E, acute accent | E |
202 | \u00ca | Ê | Capital E, circumflex accent | E |
203 | \u00cb | Ë | Capital E, umlaut | E |
204 | \u00cc | Ì | Capital I, grave accent | I |
205 | \u00cd | Í | Capital I, acute accent | I |
206 | \u00ce | Î | Capital I, circumflex accent | I |
207 | \u00cf | Ï | Capital I, umlaut | I |
209 | \u00d1 | Ñ | Capital N, tilde | N |
210 | \u00d2 | Ò | Capital O, grave accent | O |
211 | \u00d3 | Ó | Capital O, acute accent | O |
212 | \u00d4 | Ô | Capital O, circumflex accent | O |
213 | \u00d5 | Õ | Capital O, tilde | O |
214 | \u00d6 | Ö | Capital O, umlaut | O |
217 | \u00d9 | Ù | Capital U, grave accent | U |
218 | \u00da | Ú | Capital U, acute accent | U |
219 | \u00db | Û | Capital U, circumflex accent | U |
220 | \u00dc | Ü | Capital U, umlaut | U |
221 | \u00dd | Ý | Capital Y, acute accent | Y |
224 | \u00e0 | à | Small a, grave accent | a |
225 | \u00e1 | á | Small a, acute accent | a |
226 | \u00e2 | â | Small a, circumflex accent | a |
227 | \u00e3 | ã | Small a, tilde | a |
228 | \u00e4 | ä | Small a, umlaut | a |
229 | \u00e5 | å | Small a, ring | a |
231 | \u00e7 | ç | Small c, cedilla | c |
232 | \u00e8 | è | Small e, grave accent | e |
233 | \u00e9 | é | Small e, acute accent | e |
234 | \u00ea | ê | Small e, circumflex accent | e |
235 | \u00eb | ë | Small e, umlaut | e |
236 | \u00ec | ì | Small i, grave accent | i |
237 | \u00ed | í | Small i, acute accent | i |
238 | \u00ee | î | Small i, circumflex accent | i |
239 | \u00ef | ï | Small i, umlaut | i |
241 | \u00f1 | ñ | Small n, tilde | n |
242 | \u00f2 | ò | Small o, grave accent | o |
243 | \u00f3 | ó | Small o, acute accent | o |
244 | \u00f4 | ô | Small o, circumflex accent | o |
245 | \u00f5 | õ | Small o, tilde | o |
246 | \u00f6 | ö | Small o, umlaut | o |
249 | \u00f9 | ù | Small u, grave accent | u |
250 | \u00fa | ú | Small u, acute accent | u |
251 | \u00fb | û | Small u, circumflex accent | u |
252 | \u00fc | ü | Small u, umlaut | u |
253 | \u00fd | ý | Small y, acute accent | y |
255 | \u00ff | ÿ | Small y, umlaut | y |
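The approach above can be sketched in a few lines. This is a minimal illustration in Python, not Lvg's own implementation; the function name `strip_diacritics` is hypothetical, and it assumes that removing every character in the Unicode category "Mn" (nonspacing mark) after decomposition is an acceptable definition of stripping.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Strip diacritics via Normalization D (hypothetical helper, not Lvg code)."""
    # NFD splits a precomposed character such as U+00C0 into a base
    # character (U+0041) followed by a combining mark (U+0300).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except nonspacing combining marks (category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    print(strip_diacritics("\u00c0\u00e9\u00ee\u00f5\u00fc \u00c7\u00f1\u00ff"))
```

Characters that have no canonical decomposition, such as U+00D8 (Ø), pass through this sketch unchanged, because NFD leaves them as a single code point.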
Users may define their own diacritic-stripping rules. The current default definitions in Lvg are shown below:
Numeric Entity | Unicode | Symbol | Description | Stripped Character |
216 | \u00d8 | Ø | Latin Capital Letter O With Stroke | O |
248 | \u00f8 | ø | Latin Small Letter O With Stroke | o |
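One way to combine such user-defined rules with Normalization D is to apply an explicit mapping table first and then strip the remaining combining marks. The sketch below is an assumption about how this could be done, not a description of Lvg's configuration mechanism; `CUSTOM_STRIP_MAP` and `strip_with_custom_map` are hypothetical names, and the table holds only the two default entries listed above.

```python
import unicodedata

# Hypothetical mapping mirroring the two default entries in the table above.
CUSTOM_STRIP_MAP = {
    "\u00d8": "O",  # Latin Capital Letter O With Stroke
    "\u00f8": "o",  # Latin Small Letter O With Stroke
}


def strip_with_custom_map(text: str, custom_map: dict = CUSTOM_STRIP_MAP) -> str:
    """Apply user-defined replacements, then strip marks via Normalization D."""
    # Step 1: replace characters covered by the explicit mapping table.
    mapped = text.translate(str.maketrans(custom_map))
    # Step 2: decompose what remains and drop nonspacing combining marks.
    decomposed = unicodedata.normalize("NFD", mapped)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    print(strip_with_custom_map("S\u00f8ren \u00d8rsted"))
```

An explicit table is needed here because Unicode defines no canonical decomposition for Ø and ø, so normalization alone leaves them untouched.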