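A Long-Form Introduction to Front-End Globalization

Many Chinese and Chinese-backed companies are actively expanding into international markets: Shopee, Alibaba's Lazada, ByteDance's TikTok, Temu (the overseas version of Pinduoduo), and Shein in fast fashion. Once the existing domestic business approaches its expected peak, overseas business becomes the main driver of future revenue growth, which makes internationalization an increasingly important career direction.

Revenue ceiling for a domestic IT company: 「1 billion × 2 yuan × 365 days = 730 billion yuan」. That is, with 1 billion people and 2 yuan earned per user per day for 365 days, you get the revenue ceiling of a single line of business (most businesses earn less than 2 yuan per user, and few reach 1 billion users). A video membership priced at 60 yuan a month, roughly 2 yuan a day, can be estimated the same way. The real ceiling is lower still: users churn, acquiring new users costs money, and operating costs keep rising.

Internationalization is far more than adapting copy to several languages; it is a complete engineering solution. In my view, what matters more is that 「practitioners need a global outlook, tolerance and respect for diverse cultures, and fairly broad knowledge」. Examples include knowing common region abbreviations such as SEA, US, and UK, and respecting customs such as Ramadan in Islam. A product that serves users worldwide faces more complex requirements, and that diversity shows up as differences in culture and custom. Even within a country as large as China there are north-south differences. The more you learn, the more you notice how diverse the world is.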

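Concepts

How does Apple sell its keyboard in many countries?

Apple's keyboard comes in many models, and different models have different layouts: https://www.apple.com/shop/product/MK2A3J/A/magic-keyboard-japanese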

apple-keyboard

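So how would you imitate Apple and build a keyboard that can be sold around the world?

1. Produce the circuit board and other hardware components in one unified run
2. Define three layout schemes (Arabic, Russian, and Ukrainian as one; Chinese (Zhuyin) and Korean as one; Japanese as one), each with its own key cutouts
3. Print the keycaps with legends in different languages
4. Develop input software on macOS that adapts each keyboard to its language

Items 2, 3, and 4 all serve the globalization of the product

Globalization = internationalization (i18n) + localization (l10n)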

https://en.wikipedia.org/wiki/Internationalization_and_localization

globalization

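Product design, development, and deployment need to account for internationalization (i18n)

Multiple languages
Multiple layouts, such as RTL for Arabic
Multiple currencies
Deployment across multiple regions and data centers worldwide (the closer the service is to the user, the better the experience; physically isolated data storage helps meet each country's data security requirements)

Product localization (l10n) is an 「optional」 step after internationalization. It brings in a 「localization team」 to adapt the product and verify its quality before it goes to the local market

Localization is optional. For example, the UI and language of a product for the UK and the US are largely interchangeable, so little localization effort is needed
Chinese-speaking regions use both Simplified and Traditional characters, and usage differs by region, which needs special care. Cantonese in Hong Kong and Cantonese in Guangzhou differ in some vocabulary, such as 吸管 (Guangzhou) versus 饮筒 (Hong Kong) for "drinking straw"; see https://www.zhihu.com/question/20663233
Arabic, Hebrew, and other RTL languages are read right to left, which changes the product substantially and needs dedicated adaptation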

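Internationalization

A product aimed at users worldwide needs language adaptation: users in each country or region get a version in their own language. At its core this is 「text replacement」, but reading direction matters too; Arabic and Hebrew, for example, run right to left.

Look at how Apple does it, offering a different site for each country or region

Apple US: https://www.apple.com/
Apple CN: https://www.apple.com.cn/
Apple HK: https://www.apple.com/hk/en/

For the common country and region codes, see ISO 3166-1 (https://baike.baidu.com/item/ISO%203166-1/5269555?fr=ge_ala)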

Intl

MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl

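Intl is a built-in JavaScript object in the browser that supports internationalization. It provides formatting for dates, times, numbers, and currencies, along with language-aware operations, so that a site can display content according to the user's language and regional conventions. In JavaScript you can use the Intl object for the following:

Formatting dates and times: Intl.DateTimeFormat formats dates and times according to the user's regional preferences.
Formatting numbers: Intl.NumberFormat formats numbers according to the user's regional preferences.
Handling currency: Intl.NumberFormat combined with a currency code formats monetary amounts according to the user's regional preferences.
Locale information: Intl.getCanonicalLocales returns canonicalized locale names, and each formatter's supportedLocalesOf method (for example Intl.NumberFormat.supportedLocalesOf) reports which of the requested locales the runtime supports.

About punctuation: internationalized products often fall back to English punctuation, because it is widely understood and avoids misreadings caused by regional differences. The Intl object does not deal with punctuation directly; it mainly covers formatting of dates, times, numbers, and currencies.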
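The two locale helpers mentioned above behave differently, which a short sketch makes concrete. The sample tags are chosen only for illustration, and the result of supportedLocalesOf depends on the locale data shipped with the runtime.

```javascript
// Canonicalize locale tags: fixes casing and validates the structure.
const canonical = Intl.getCanonicalLocales(['EN-us', 'zh-hans-cn']);
console.log(canonical); // ['en-US', 'zh-Hans-CN']

// Ask a specific formatter which of the requested locales it supports.
const requested = ['zh-CN', 'en-US', 'de-DE'];
const supported = Intl.NumberFormat.supportedLocalesOf(requested);
console.log(supported); // depends on the runtime's locale data
```

getCanonicalLocales throws a RangeError for a structurally invalid tag, so it can also serve as a cheap validity check before a tag is stored.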

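Prerequisites

Language tags

The world has a great many peoples. Some have their own language; others adopted a language passed on from another nation, absorbed it, and developed their own writing.

Language tags are built from ISO codes: the language part comes from ISO 639-1 (https://zh.m.wikipedia.org/wiki/ISO_639-1) and can be refined with further subtags. 「zh」 is Chinese as a broad category, 「zh-CN」 is Chinese as used in mainland China (in practice Simplified Chinese), and Singapore's large Chinese population gives 「zh-SG」, Chinese as used in Singapore. There are also 「zh-HK」, 「zh-TW」, 「zh-Hant」, 「zh-Hans」, and so on

A language tag has a three-part structure, [language]-[script]-[region]. For example, zh-Hans-CN means Simplified Chinese in mainland China, and zh-Hant covers all Traditional Chinese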
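The three-part structure can be inspected directly with Intl.Locale. The sketch below only reads the parts of a few sample tags; maximize() fills in the likely script and region based on the runtime's locale data.

```javascript
// Read the [language]-[script]-[region] parts of a tag.
const tag = new Intl.Locale('zh-Hans-CN');
console.log(tag.language, tag.script, tag.region); // zh Hans CN

// maximize() adds the most likely missing subtags.
const taiwan = new Intl.Locale('zh-TW').maximize();
console.log(taiwan.script); // Hant

const bare = new Intl.Locale('zh').maximize();
console.log(bare.toString()); // zh-Hans-CN
```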

Language Code Table(http://www.lingoes.net/zh/translator/langcode.htm)

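Let's look at how Apple's website handles multiple regions and languages
Apple Macao: https://www.apple.com/mo/
Apple Hong Kong
English: https://www.apple.com/hk/en/
Chinese: https://www.apple.com/hk/
Apple mainland China: https://www.apple.com.cn/
Apple Taiwan: https://www.apple.com/tw/
Apple Singapore: https://www.apple.com/sg/
Apple Japan: https://www.apple.com/jp/
Some sites are separated by an ISO region path under the root domain, such as **/hk/**, while others use a different domain altogether, such as mainland China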

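Text reading order

Ordinary Chinese and English are both LTR with lines running top to bottom, the most widely used order in the world

Arabic (ar), Urdu (ur), and Hebrew (he) are different: they run right to left (RTL). This is usually marked with the dir attribute on an element; see HTML dir Attribute (https://www.w3schools.com/tags/att_global_dir.asp)

HEBREW refers to the Hebrew language, widely used in Israel and the original language of the Jewish scriptures. It belongs to the Afroasiatic family and has a long history and great cultural value. 「Hebrew has its own writing system, written from right to left.」

The concepts in the figure above are rarely explained, because products that are not internationalized do not need multiple languages. If you support overseas business or global applications, they are worth learning. English, Simplified Chinese, Latin, and similar scripts all follow the LATIN reading order in the figure. If we describe our habits with 「top, bottom, left, right」, that order traces a 「Z」: lines run from top to bottom, and within a line you read from left to right.

Document flow and reading order

That is, left→right and top→bottom, with a priority between them: left→right takes precedence over top→bottom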

The web standards define these directions as follows (for horizontal, left-to-right text)

left=inline-start
right=inline-end
top=block-start
bottom=block-end

A joke: ancient Chinese books were typeset with writing-mode: vertical-rl

joke

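Layout

content-flows

For example, margin-left or text-align: left is not appropriate in a multilingual context: your left and right are not someone else's left and right.

Use margin-inline-start and text-align: start instead, which are expressed in terms of the inline axis and the block axis

// Each pair below is equivalent in horizontal LTR text; prefer the logical form over left/right/top/bottom
// https://web.dev/learn/css/logical-properties/#terminology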

margin-left: 1px
margin-inline-start: 1px

margin-right: 1px
margin-inline-end: 1px

margin-top: 1px
margin-block-start: 1px

margin-bottom: 1px
margin-block-end: 1px

text-align: left
text-align: start

text-align: right
text-align: end

max-width: 100px
max-inline-size: 100px

max-height: 100px
max-block-size: 100px

padding-left: 1px
padding-inline-start: 1px

padding-top: 1px
padding-block-start: 1px

top: 0.2em;
inset-block-start: 0.2em;

bottom: 0.2em;
inset-block-end: 0.2em;

left: 2px;
inset-inline-start: 2px;

right: 2px;
inset-inline-end: 2px;

border-bottom: 1px solid red;
border-block-end: 1px solid red;

border-bottom-right-radius: 1em;
border-end-end-radius: 1em;

height: 160px;
block-size: 160px;

width: 160px;
inline-size: 160px;
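The pairs above can also be kept as a lookup table, for example in a lint rule or a codemod that rewrites physical declarations. The sketch below is a minimal illustration: the names PROPERTY_MAP, TEXT_ALIGN_VALUES, and toLogical are hypothetical, the table covers only some of the pairs listed above, and the mapping is valid only for horizontal writing modes.

```javascript
// Physical-to-logical property names (horizontal writing modes only).
const PROPERTY_MAP = new Map([
  ['margin-left', 'margin-inline-start'],
  ['margin-right', 'margin-inline-end'],
  ['margin-top', 'margin-block-start'],
  ['margin-bottom', 'margin-block-end'],
  ['padding-left', 'padding-inline-start'],
  ['padding-top', 'padding-block-start'],
  ['left', 'inset-inline-start'],
  ['right', 'inset-inline-end'],
  ['top', 'inset-block-start'],
  ['bottom', 'inset-block-end'],
  ['width', 'inline-size'],
  ['height', 'block-size'],
  ['max-width', 'max-inline-size'],
  ['max-height', 'max-block-size'],
]);

// For text-align the property name stays and the value changes.
const TEXT_ALIGN_VALUES = new Map([
  ['left', 'start'],
  ['right', 'end'],
]);

// Returns a [property, value] pair in logical form; unknown input passes through.
function toLogical(property, value) {
  if (property === 'text-align') {
    return [property, TEXT_ALIGN_VALUES.get(value) ?? value];
  }
  return [PROPERTY_MAP.get(property) ?? property, value];
}

console.log(toLogical('margin-left', '1px').join(': ')); // margin-inline-start: 1px
console.log(toLogical('text-align', 'left').join(': ')); // text-align: start
```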

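See also the examples below

https://codepen.io/web-dot-dev/pen/gOxXOLK
https://codepen.io/web-dot-dev/pen/mdMqdOx

Both examples use properties such as margin-inline-start; adding dir="rtl" to the html element is then enough to support the other reading order

As a result, common layouts are updated as follows: the familiar physical box model is used for size calculation, and the logical box model is used for internationalization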

盒子模型

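writing-mode determines content flow

As described above, a document has an inline flow and a block flow, which in English correspond to left/right and top/bottom. writing-mode changes the content flow, with values such as

/* Keyword values */
writing-mode: horizontal-tb;
writing-mode: vertical-rl;
writing-mode: vertical-lr;

You can read writing-mode: horizontal-tb this way: the first part, horizontal or vertical, is the direction of the inline axis, and the second part (tb, rl, lr) is the direction in which lines and blocks stack, for example tb for top to bottom.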

https://codepen.io/manfredhu/pen/xxWdpaK

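vi and vb

Viewport units have logical counterparts as well: alongside vw and vh there are vi (viewport inline) and vb (viewport block)

In a horizontal writing mode, 1% of the viewport width = 1vw = 1vi and 1% of the viewport height = 1vh = 1vb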

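scrollLeft in JS

The DOM API 「Element.scrollLeft」 returns how far an element has been scrolled horizontally. The figure below shows a real example

scrollLeft in RTL

A mask (the green-bordered area) sits at the end. The blue part inside works like a carousel, and overflow:hidden clips whatever extends past the blue highlighted area

When the blue part is scrolled to the end, the green mask is hidden, so the mask covers the edge and disappears once you reach the end. The code is as follows

const ref=document.querySelector('.tiktok-table__container') // parent node, the blue area
const ref2=document.querySelector('.tiktok-table__container > table') // child node, the table
const bufferWidth=30 // leave a little buffer
if (ref && ref2 && ref.clientWidth + ref.scrollLeft >=ref2.clientWidth - bufferWidth) {
  // scrolled to the end: hide the green mask
  setTableRightMask(false)
} else {
  setTableRightMask(true)
}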

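In RTL, however, something surprising happens: scrollLeft is negative

That is because RTL is enabled by adding dir="rtl" to the HTML element, which mirrors the document's inline direction. The origin (0, 0) is now at the right edge of the table, so scrolling toward the left yields negative values

The fix is simple: take the absolute value, which removes the effect of direction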
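The absolute-value fix can be folded into the end-of-scroll check. This is a minimal sketch rather than the original component code: isScrolledToInlineEnd is a hypothetical helper that takes any objects exposing clientWidth and scrollLeft, so it works with DOM elements and with plain objects alike.

```javascript
// Returns true when the container has been scrolled to the inline end,
// regardless of whether scrollLeft counts up (LTR) or down from 0 (RTL).
function isScrolledToInlineEnd(container, content, bufferWidth = 30) {
  const scrolled = Math.abs(container.scrollLeft);
  return container.clientWidth + scrolled >= content.clientWidth - bufferWidth;
}

// Plain objects stand in for DOM elements here.
const content = { clientWidth: 800 };
console.log(isScrolledToInlineEnd({ clientWidth: 300, scrollLeft: 480 }, content));  // true (LTR)
console.log(isScrolledToInlineEnd({ clientWidth: 300, scrollLeft: -480 }, content)); // true (RTL)
console.log(isScrolledToInlineEnd({ clientWidth: 300, scrollLeft: -100 }, content)); // false
```

In the component above, the result would be passed to setTableRightMask in negated form, since the mask is shown only while there is more to scroll.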

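locale

Countries and regions are identified by ISO 3166 codes (https://en.wikipedia.org/wiki/ISO_3166-2); the two-letter country codes come from ISO 3166-1. "US" is the United States and "CN" is China. Common locale tags include 「zh-CN, en-US, en-GB」

CN is a country/region code defined by ISO 3166-1, the international standard for identifying countries and regions, in which each country or region has a unique two-letter code. "CN" stands for the People's Republic of China
zh-CN is a language-region tag. "zh" is the language code for Chinese from ISO 639-1, the international standard that assigns two-letter codes to languages, and "CN" is China. "zh" alone does not distinguish varieties of Chinese such as Mandarin and Cantonese.
zh-Hans-CN is Simplified Chinese in mainland China. "zh-Hans-SG" is Simplified Chinese as used in Singapore, and "zh-Hant-TW" is Traditional Chinese as used in Taiwan ("zh-Hans-TW" would denote Simplified Chinese in Taiwan, which is not the script officially used there)

Intl.Locale

Here is how the Intl API defines a locale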

const korean=new Intl.Locale('ko', {
  script: 'Kore',
  region: 'KR',
  hourCycle: 'h23',
  calendar: 'gregory',
});

const japanese=new Intl.Locale('ja-Jpan-JP-u-ca-japanese-hc-h12');

console.log(korean.baseName, japanese.baseName);
// Expected output: "ko-Kore-KR" "ja-Jpan-JP"

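As you can see, Intl.Locale splits the input string into its [language]-[script]-[region] components.

ja: the language code, Japanese
Jpan: the script code, the Japanese writing system (Han, Hiragana, and Katakana)
JP: the region code, Japan

Here u-ca-japanese selects the Unicode calendar extension, that is, the calendar format (the Japanese calendar differs from most others), and hc is hourCycle, so hc-h12 means a 12-hour clock. The order of ca-japanese and hc-h12 does not matter, so the two forms below are equivalent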

const japanese=new Intl.Locale('ja-Jpan-JP-u-ca-japanese-hc-h12');
const japanese2=new Intl.Locale('ja-Jpan-JP-u-hc-h12-ca-japanese');

-u (Unicode) can be thought of as an extension mechanism, and it supports the extension keys below. The example above uses the calendar and hourCycle extensions

calendar (extension key ca): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/calendar
caseFirst (extension key kf): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/caseFirst
collation (extension key co): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/collation
hourCycle (extension key hc): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/hourCycle
numberingSystem (extension key nu): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/numberingSystem
numeric (extension key kn): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/numeric
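The claim that keyword order is irrelevant can be checked by comparing canonical forms. The sketch below reuses the tags from this section; the options-object form at the end is an equivalent way to set the same extensions.

```javascript
const first = new Intl.Locale('ja-Jpan-JP-u-ca-japanese-hc-h12');
const second = new Intl.Locale('ja-Jpan-JP-u-hc-h12-ca-japanese');

// Extension values are exposed as properties.
console.log(first.calendar, first.hourCycle); // japanese h12

// toString() returns the canonical tag, with extension keys in sorted order.
console.log(first.toString() === second.toString()); // true

// The same extensions can be passed as options instead of being written into the tag.
const viaOptions = new Intl.Locale('ja-JP', { calendar: 'japanese', hourCycle: 'h12' });
console.log(viaOptions.toString()); // ja-JP-u-ca-japanese-hc-h12
```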

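Language declaration

<html lang="en-US">
<!-- The page language is American English -->
<p>I felt some <span lang="de">schadenfreude</span>.</p> <!-- lang is not limited to the html element; other elements can carry it too -->
<a href="/path/to/german/version" hreflang="de" lang="de">Deutsche Version</a> <!-- on an a element, lang gives the language of the link text and hreflang gives the language of the target page -->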

The language subtag

The language subtag is the primary part of a language tag and identifies the language itself. In BCP 47, for example, "en" is English and "es" is Spanish.

const locales=['en', 'de', 'ja'];
const displayNames=new Intl.DisplayNames('en-US', { type: 'language' });
locales.forEach(locale=> console.log(displayNames.of(locale)));

// English
// German
// Japanese

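With lang attributes like those shown earlier in place, styling this content is straightforward with CSS selectors such as [lang|="fr"] or :lang(fr)

[lang|="fr"] selects elements whose lang attribute is exactly fr or starts with fr followed by a hyphen, such as lang="fr-CA"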

The script subtag

The script subtag is an optional part of a language tag that names the writing system. It is auxiliary information that pins down how a language is written. For example, "Hans" is Simplified Han (Simplified Chinese) and "Latn" is Latin.

With type: 'script', of() accepts the four-letter ISO 15924 script codes used in BCP 47 (https://en.wikipedia.org/wiki/IETF_language_tag), such as Hans

// The exact wording of each name depends on the engine's CLDR data
const scriptNamesEn=new Intl.DisplayNames('en-US', { type: 'script' });
console.log(scriptNamesEn.of('Hans')); // output：Simplified
console.log(scriptNamesEn.of('Hant')); // output：Traditional
console.log(scriptNamesEn.of('Latn')); // output：Latin

const scriptNamesZh=new Intl.DisplayNames('zh-CN', { type: 'script' });
console.log(scriptNamesZh.of('Hans')); // output：简体
console.log(scriptNamesZh.of('Hant')); // output：繁体
console.log(scriptNamesZh.of('Latn')); // output：拉丁文

The region subtag

With type: 'region', of() accepts the two-letter ISO 3166-1 country codes (see also https://en.wikipedia.org/wiki/ISO_3166-2)

const regionNamesInEnglish=new Intl.DisplayNames(['en'], { type: 'region' });
const regionNamesInTraditionalChinese=new Intl.DisplayNames(['zh-Hant'], { type: 'region' });
console.log(regionNamesInEnglish.of('US'));
// Expected output: "United States"
console.log(regionNamesInTraditionalChinese.of('US'));
// Expected output: "美國"

文本

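Declaring reading order

How the reading order of text is declared

The dir attribute on the html element

Reading order can be set with the dir attribute on the html element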

<html dir="ltr">
<html dir="rtl">

css属性writing-mode(https://developer.mozilla.org/en-US/docs/Web/CSS/writing-mode)

writing-mode: horizontal-tb;
writing-mode: vertical-lr;
writing-mode: vertical-rl;

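writing-mode sets the layout direction of lines of text and the direction in which blocks stack. To apply it to the whole document, set it on the html element

「The first part, horizontal or vertical, is the orientation of the lines of text (the inline axis); the second part is the direction in which lines and blocks progress (the block flow)」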

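Mirroring

Symbols such as the question mark differ between languages

Arabic (ar) and Urdu (ur) use a mirrored question mark, 「؟」 (https://zh.m.wiktionary.org/wiki/%D8%9F)

Hebrew (he), although it is also written RTL, uses the ordinary 「?」

Remarkable, isn't it? Even a question mark has its variations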

线索管理页-英文和阿拉伯语

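Common languages that need RTL include the following

「Arabic (AR)」: one of the best-known languages written RTL. It is the main language of the Middle East and North Africa and the liturgical language of Islam.
「Hebrew (HE)」: the religious and cultural language of the Jewish people and the official language of Israel. It is also written RTL.
「Persian (FA)」: also called Farsi, the official language of Iran and an official or auxiliary language in some neighboring countries. It is written RTL.
「Urdu (UR)」: an official language of Pakistan and one of the official languages in India, written RTL.
「Pashto (PS)」: one of the official languages of Afghanistan, also written RTL.

const rtlLangs=[
    'ar', // Arabic
    'ur', // Urdu
    'he', // Hebrew
    'he-IL', // Hebrew (Israel)
    'fa-IR', // Persian (Iran)
    'ps' // Pashto
];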
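A list like the one above mixes bare languages and language-region tags, so a direct lookup misses tags such as ar-EG. One way around that, sketched here with hypothetical names (RTL_LANGUAGES, directionOf), is to compare only the language subtag. The set is limited to the five languages named in this section and is not exhaustive.

```javascript
// Languages from the list above that are written right to left.
const RTL_LANGUAGES = new Set(['ar', 'he', 'fa', 'ur', 'ps']);

// Returns 'rtl' or 'ltr' for any well-formed language tag.
function directionOf(tag) {
  const { language } = new Intl.Locale(tag);
  return RTL_LANGUAGES.has(language) ? 'rtl' : 'ltr';
}

console.log(directionOf('he-IL')); // rtl
console.log(directionOf('ar-EG')); // rtl
console.log(directionOf('zh-CN')); // ltr

// In a page this could drive the dir attribute:
// document.documentElement.dir = directionOf(currentLocale);
```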

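Multilingual copy

A large share of l10n work is translating text; going from one language to N requires a localization team. There are several ways to implement it technically

Bundling copy into the build

Copy is looked up through a key: text mapping, so t('key') renders the text at runtime. This approach has no external dependencies; like any ordinary page, everything is a file on the CDN. The downside is that the copy is a static resource the user has to download, and if a lookup goes wrong the page shows the key instead of the text

vue i18n：https://kazupon.github.io/vue-i18n/zh/started.html#html
react i18n：https://react.i18next.com

The example below uses Vue: static message files such as 「en.json, fr.json」 are configured and bundled into the JS files served from the CDN

Vue i18n example：https://codesandbox.io/p/sandbox/o7n6pkpwoy?file=%2Fstore.js%3A10%2C14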
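Independent of any framework, the key: text lookup and its failure mode can be sketched in a few lines. This is an illustration only, not how vue-i18n or react-i18next are implemented: the messages object, createTranslator, and the fallback order (requested locale, then a default locale, then the key itself) are all assumptions.

```javascript
// Hypothetical bundled copy, as it might come from en.json and fr.json.
const messages = {
  en: { greeting: 'Hello', cart: 'Shopping cart' },
  fr: { greeting: 'Bonjour' },
};

// Builds a t(key) function for one locale with a fallback locale.
function createTranslator(locale, fallbackLocale = 'en') {
  return function t(key) {
    return messages[locale]?.[key] ?? messages[fallbackLocale]?.[key] ?? key;
  };
}

const t = createTranslator('fr');
console.log(t('greeting')); // Bonjour
console.log(t('cart'));     // Shopping cart (falls back to en)
console.log(t('checkout')); // checkout (no copy anywhere, so the key is shown)
```

The last call is the situation described above: when no copy is found, the raw key reaches the user, which is why missing-key checks are usually part of the build or release process.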

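Fetching from an API

The program fetches copy from an API at runtime. The page language can be marked with a URL query parameter such as lang=xxx, or with a cookie that stores the chosen language

Real-time translation

A translation script is loaded and replaces the displayed text when the language is switched. The advantage is that only the current language is loaded, with no redundancy from other languages. The drawback is the dependency on a translation service: if it goes down, the page cannot be used normally

User -> gateway -> SSR -> i18n cache -> real-time translation services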

Placeholders and plurals: ICU syntax

DevPal - ICU message editor：https://devpal.co/icu-message-editor/?data=I%20have%20%7Bnum%2C%20plural%2C%20one%7B%7Bnumber%7D%20%23%20apple%7D%20other%7B%23%20apples%7D%7D%2Cbut%20it%27s%20too%20small%0A

ICU message syntax is a general-purpose DSL with if-else style branching. The message below renders differently depending on the value passed in, which is why it is widely used in internationalized products

I have {num, plural, one{# apple} other{# apples}}, but it's too small
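The branching that an ICU plural message expresses can be reproduced by hand with Intl.PluralRules, which is what message-format libraries build on. The sketch below does not parse ICU syntax; appleMessages and formatApples are hypothetical names, and only the English categories one and other are covered.

```javascript
const pluralRules = new Intl.PluralRules('en-US');

// One template per plural category, mirroring one{...} other{...}.
const appleMessages = {
  one: (n) => `I have ${n} apple`,
  other: (n) => `I have ${n} apples`,
};

function formatApples(count) {
  const category = pluralRules.select(count); // 'one' or 'other' in English
  const build = appleMessages[category] ?? appleMessages.other;
  return build(count);
}

console.log(formatApples(1)); // I have 1 apple
console.log(formatApples(5)); // I have 5 apples
console.log(formatApples(0)); // I have 0 apples
```

Other languages return more categories (zero, two, few, many), which is why the lookup falls back to other instead of assuming a fixed set.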

Intl.Segmenter

If you have used vim, you know that w (word) moves to the next word. English text divides into words, sentences, and paragraphs, and so does Chinese

「w」 word
「s」 sentence
「p」 paragraph

const segmenter=new Intl.Segmenter('en-US', { granularity: 'word' });

const text='This is a sample text for demonstration purposes.';

// 使用 Segmenter 对文本进行分割
const segments=[...segmenter.segment(text)];
console.log(segments);

// 0: {segment: 'This', index: 0, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 1: {segment: ' ', index: 4, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 2: {segment: 'is', index: 5, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 3: {segment: ' ', index: 7, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 4: {segment: 'a', index: 8, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 5: {segment: ' ', index: 9, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 6: {segment: 'sample', index: 10, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 7: {segment: ' ', index: 16, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 8: {segment: 'text', index: 17, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 9: {segment: ' ', index: 21, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 10: {segment: 'for', index: 22, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 11: {segment: ' ', index: 25, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 12: {segment: 'demonstration', index: 26, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 13: {segment: ' ', index: 39, input: 'This is a sample text for demonstration purposes.', isWordLike: false}
// 14: {segment: 'purposes', index: 40, input: 'This is a sample text for demonstration purposes.', isWordLike: true}
// 15: {segment: '.', index: 48, input: 'This is a sample text for demonstration purposes.', isWordLike: false}

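Intl.Segmenter splits sentences, paragraphs, or whole articles into segments according to its configuration. Each result is an object, much like a regex match, with a segment property

Another example: in Chinese, 「真的」 ("really") is a single word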

// 创建分段器，指定语言环境和分段类型为'word'
const segmenter=new Intl.Segmenter(['en', 'zh'], { granularity: 'word' });

// 要分割的字符串
const text='Hello世界Hello world';

// 使用分段器分割字符串
const segments=segmenter.segment(text);

// 遍历并打印每个分段的结果
for (const segment of segments) {
    console.log(`Segment: ${segment.segment}, Index: ${segment.index}, IsWordLike: ${segment.isWordLike}`);
}
// Segment: Hello, Index: 0, IsWordLike: true
// Segment: 世界, Index: 5, IsWordLike: true
// Segment: Hello, Index: 7, IsWordLike: true
// Segment:  , Index: 12, IsWordLike: false
// Segment: world, Index: 13, IsWordLike: true

const str="我真的很强, 强哥的强";
const segmenterZh=new Intl.Segmenter("zh-CN", { granularity: "word" });

const zhSegments=segmenterZh.segment(str);
console.log(Array.from(zhSegments));

// 0: {segment: '我', index: 0, input: '我真的很强, 强哥的强', isWordLike: true}
// 1: {segment: '真的', index: 1, input: '我真的很强, 强哥的强', isWordLike: true}
// 2: {segment: '很', index: 3, input: '我真的很强, 强哥的强', isWordLike: true}
// 3: {segment: '强', index: 4, input: '我真的很强, 强哥的强', isWordLike: true}
// 4: {segment: ',', index: 5, input: '我真的很强, 强哥的强', isWordLike: false}
// 5: {segment: ' ', index: 6, input: '我真的很强, 强哥的强', isWordLike: false}
// 6: {segment: '强', index: 7, input: '我真的很强, 强哥的强', isWordLike: true}
// 7: {segment: '哥', index: 8, input: '我真的很强, 强哥的强', isWordLike: true}
// 8: {segment: '的', index: 9, input: '我真的很强, 强哥的强', isWordLike: true}
// 9: {segment: '强', index: 10, input: '我真的很强, 强哥的强', isWordLike: true}

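Time & time zones

Internationalization runs into time zones, which exist because the Earth's rotation under the sun produces day and night. Local time differs from UTC in most countries and regions. Almost everyone can say they get up in the morning and go to bed at night, and everyone understands what that means, but the UTC time of that morning is different in each place

GMT and UTC

GMT = Greenwich Mean Time, the time at Greenwich in the UK. It is tied to the Earth's rotation, which is not perfectly uniform, so it is no longer used as the standard
UTC = 「Coordinated Universal Time, kept by atomic clocks」

时间的往事--记一次与夏令时的斗智斗勇：https://jiangyixiong.top/2021/05/25/%E6%97%B6%E9%97%B4%E7%9A%84%E5%BE%80%E4%BA%8B%E2%80%94%E2%80%94%E8%AE%B0%E4%B8%80%E6%AC%A1%E4%B8%8E%E5%A4%8F%E4%BB%A4%E6%97%B6%E7%9A%84%E6%96%97%E6%99%BA%E6%96%97%E5%8B%87

GMT standard time and world time zone lookup: https://time.artjoey.com/cn

The NTP protocol (https://zh.wikipedia.org/wiki/%E7%B6%B2%E8%B7%AF%E6%99%82%E9%96%93%E5%8D%94%E5%AE%9A) keeps computers across the global network on the same time

「Offset and time zone」

An offset is the difference from UTC. China is in UTC+8, so its offset is +08:00:00
UTC+8 covers more than China: it is the set of regions that share this offset, for example

UTC+8={CST (China Standard Time), SGT (Singapore Time), AWST (Australian Western Standard Time)... }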
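The distinction between an offset and a time zone shows up when you compute the offset of a named zone at a given instant: several zones share +08:00, and a single zone can have different offsets over the year. The sketch below is one common approach, not a built-in API: offsetMinutes is a hypothetical helper that formats the instant in the target zone, reads the wall-clock fields back, and compares them with UTC.

```javascript
// Offset of an IANA time zone from UTC, in minutes, at a given instant.
// Positive values are east of UTC (for example 480 for UTC+8).
function offsetMinutes(timeZone, date) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: 'numeric', day: 'numeric',
    hour: 'numeric', minute: 'numeric', second: 'numeric',
  }).formatToParts(date);
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  // Interpret the zone's wall-clock reading as if it were UTC.
  const wallClockAsUtc = Date.UTC(
    get('year'), get('month') - 1, get('day'),
    get('hour'), get('minute'), get('second'),
  );
  const instant = Math.floor(date.getTime() / 1000) * 1000; // drop milliseconds
  return (wallClockAsUtc - instant) / 60000;
}

const newYear = new Date('2024-01-01T00:00:00Z');
console.log(offsetMinutes('Asia/Shanghai', newYear));   // 480
console.log(offsetMinutes('Asia/Singapore', newYear));  // 480
console.log(offsetMinutes('Australia/Perth', newYear)); // 480
```

The same helper returns -480 and then -420 for America/Los_Angeles on either side of 2021-03-14T10:00:00Z, because that zone switches to daylight saving time at that instant.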

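How to get the current user's time zone

// IANA time zone identifier of the user's environment, such as America/New_York
const timeZone=new Intl.DateTimeFormat().resolvedOptions().timeZone;
console.log("User time zone: " + timeZone); // User time zone: Asia/Shanghai

// Difference between UTC and local time in minutes (UTC minus local). -480 means local time is 8 hours ahead of UTC (UTC+8, Beijing time); US Eastern Standard Time (UTC-5) gives 300
const date=new Date();
const timeZoneOffset=date.getTimezoneOffset();
console.log("Time zone offset: " + timeZoneOffset); // Time zone offset: -480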
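Because getTimezoneOffset() has the opposite sign from the usual UTC+8 notation, a small conversion helper avoids mistakes. formatUtcOffset below is a hypothetical helper, written as a sketch of that sign flip.

```javascript
// Converts the value of Date.prototype.getTimezoneOffset() (UTC minus local,
// in minutes) into the familiar "UTC+08:00" notation.
function formatUtcOffset(timezoneOffsetMinutes) {
  const total = -timezoneOffsetMinutes; // flip the sign: local minus UTC
  const sign = total >= 0 ? '+' : '-';
  const abs = Math.abs(total);
  const hours = String(Math.floor(abs / 60)).padStart(2, '0');
  const minutes = String(abs % 60).padStart(2, '0');
  return `UTC${sign}${hours}:${minutes}`;
}

console.log(formatUtcOffset(-480)); // UTC+08:00 (Beijing)
console.log(formatUtcOffset(300));  // UTC-05:00 (US Eastern Standard Time)
console.log(formatUtcOffset(-330)); // UTC+05:30 (India)
```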

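Intl is a relatively new browser API. Like Math, it is a global namespace object, and it is dedicated to internationalization and localization work. Its DateTimeFormat handles the date and time side of internationalization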

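DST

DST (daylight saving time) is also known as summer time. 「On a given day each spring the clock is moved forward one hour, and on a given day in autumn it is moved back one hour.」 Products that are not internationalized rarely run into it, mainly because 「China does not observe daylight saving time.」

Why observe DST? Germany introduced it first, during World War I, to make better use of daylight: clocks are set one hour ahead in summer and back again in winter
What happens in practice? DST is an administrative convention, so each country is free to announce the day and hour on which it begins and ends. IANA maintains the time zone database (https://www.iana.org/time-zones) that records every country's DST rules. On a machine with current rules, local time 「jumps」 at a particular second: in autumn, for instance, the clock goes from 1:59:59 back to 1:00:00
How does a computer represent time? It keeps a Unix time, the time elapsed since 1970-01-01 00:00:00 UTC (counted in seconds; JavaScript's Date counts milliseconds). It is an absolute value, that is, UTC time
Each device then formats UTC time as local time according to its region; China, for example, is in UTC+8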

2021-03-14 01:59:59 GMT-08:00（太平洋标准时间，PST）
2021-03-14T01:59:59.000-08:00（ISO格式表示）
2021-03-14T09:59:59.000Z（转换为UTC时间并以ISO格式表示）

// 下一秒时间突变
2021-03-14 03:00:00 GMT-07:00（太平洋夏令时间，PDT）
2021-03-14T03:00:00.000-07:00（ISO格式表示）
2021-03-14T10:00:00.000Z（转换为UTC时间并以ISO格式表示）

// 原始时间字符串
const timeString="2021-03-14T09:59:59.000Z";

// 将时间字符串转换为 Date 对象
const date=new Date(timeString);
const pstOutput=date.toLocaleString("en-US", { timeZone: "America/Los_Angeles", hour12: false });
console.log(pstOutput); // 3/14/2021, 01:59:59

// 获取时间戳
const timestamp=date.getTime();

// 增加1秒
const newTimestamp=timestamp + 1000;

// 创建新的 Date 对象并格式化为 PDT 时间
const newDate=new Date(newTimestamp);
const pdtOutput=newDate.toLocaleString("en-US", { timeZone: "America/Los_Angeles", hour12: false });

console.log(pdtOutput); // 3/14/2021, 03:00:00

Time handling

Day.js plugins

dayjs: https://day.js.org/docs/zh-CN/i18n/i18n

Locale support: https://github.com/iamkun/dayjs/tree/dev/src/locale

How it works: it loads per-language locale data and outputs formatted date and time strings in that language

See this demo

Days of the week：https://codesandbox.io/s/dayjs-dynamic-locale0import-forked-wnk2zq?file=/src/index.js

(My local system is set to start the week on Sunday)

Intl API

const date=new Date();
const formattedDateUS=new Intl.DateTimeFormat('en-US').format(date);
console.log(formattedDateUS); // 10/29/2023
const formattedDateCN=new Intl.DateTimeFormat('zh-CN').format(date);
console.log(formattedDateCN); // 2023/10/29

本地时间输出

// 创建 DateTimeFormat 对象，并指定语言和地区
const dateFormatterCN=new Intl.DateTimeFormat('zh-CN', {
  year: 'numeric',
  month: 'long', // 使用完整的月份名称
  day: 'numeric',
});
console.log(dateFormatterCN.format(new Date('2024-04-28'))); // 2024年4月28日

const dateFormatterUS=new Intl.DateTimeFormat('en-US', {
  year: 'numeric',
  month: 'long', // 使用完整的月份名称
  day: 'numeric',
});
console.log(dateFormatterUS.format(new Date('2024-04-28'))); // April 28, 2024

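「Intl.RelativeTimeFormat」 relative time

「Intl.RelativeTimeFormat」 is the internationalization API in JavaScript for formatting relative time, such as "1 hour ago" or "in 2 days". It renders relative time in natural language for a given language and region, which helps an application fit a multilingual environment.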

const rtf1=new Intl.RelativeTimeFormat('zh', { style: 'short' });

console.log(rtf1.format(3, 'quarter'));
// Expected output: "3个季度后"

console.log(rtf1.format(-1, 'day'));
// Expected output: "1天前"

// 'zh' is Chinese; the language code for Japanese is 'ja', not 'jp'
const rtf2=new Intl.RelativeTimeFormat('zh', { numeric: 'auto' });

console.log(rtf2.format(2, 'day'));
// Expected output: "后天"
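Intl.RelativeTimeFormat needs a value and a unit, so in practice a small wrapper picks the unit from the difference between two dates. The sketch below is one simple way to do that; relativeTime and UNITS are hypothetical names, and months and years use approximate lengths (30 and 365 days).

```javascript
// Units from largest to smallest, with their approximate length in seconds.
const UNITS = [
  ['year', 365 * 24 * 3600],
  ['month', 30 * 24 * 3600],
  ['day', 24 * 3600],
  ['hour', 3600],
  ['minute', 60],
  ['second', 1],
];

// Describes `target` relative to `now` in the given locale.
function relativeTime(now, target, locale) {
  const rtf = new Intl.RelativeTimeFormat(locale, { numeric: 'auto' });
  const diffSeconds = Math.round((target - now) / 1000);
  for (const [unit, seconds] of UNITS) {
    if (Math.abs(diffSeconds) >= seconds || unit === 'second') {
      return rtf.format(Math.trunc(diffSeconds / seconds), unit);
    }
  }
}

const now = new Date('2024-01-10T00:00:00Z');
console.log(relativeTime(now, new Date('2024-01-09T00:00:00Z'), 'en')); // yesterday
console.log(relativeTime(now, new Date('2024-01-09T21:00:00Z'), 'en')); // 3 hours ago
console.log(relativeTime(now, new Date('2024-01-12T00:00:00Z'), 'zh')); // 后天
```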

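In Chinese, numbers from ten thousand up are naturally read in groups of four digits: 10000 is 1万, or 1 0000. Written as 1 2345 6789 or 1’2345’6789 (with ’ as the ten-thousands separator), the number reads at a glance as 一亿两千三百四十五万六千七百八十九. Written as 123,456,789, many readers have to stop and count digits before they know how large it is. Yet many banking apps now promote the latter, the thousands separator used in Europe and America. The following discussion makes a reasonable case

设计产品时，你是如何掉入从众的陷阱中的？– 人人都是产品经理：https://www.woshipm.com/pd/1500589.html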
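Four-digit grouping is not what Intl.NumberFormat produces for zh-CN, which groups by thousands, so it has to be done by hand if a product wants it. The function below is a minimal sketch under that assumption; groupByMyriad is a hypothetical name, and it handles only plain decimal notation.

```javascript
// Groups the integer part of a number in blocks of four digits from the right.
function groupByMyriad(value, separator = ' ') {
  const [integer, fraction] = String(value).split('.');
  const grouped = integer.replace(/\B(?=(\d{4})+(?!\d))/g, separator);
  return fraction === undefined ? grouped : `${grouped}.${fraction}`;
}

console.log(groupByMyriad(123456789));      // 1 2345 6789
console.log(groupByMyriad(123456789, '’')); // 1’2345’6789
console.log(groupByMyriad(10000));          // 1 0000
console.log(groupByMyriad(1234.5));         // 1234.5
```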

The examples below show the same kind of difference: in German and French the thousands separator is . and a space, respectively

const number=1234567.89;

const formattedNumberCN=new Intl.NumberFormat('zh-CN').format(number);
console.log(formattedNumberCN); // 1,234,567.89

const formattedNumberUS=new Intl.NumberFormat('en-US').format(number);
console.log(formattedNumberUS); // 1,234,567.89

const formattedNumberDE=new Intl.NumberFormat('de-DE').format(number);
console.log(formattedNumberDE); // 1.234.567,89

// The French separator is a narrow no-break space, not an ordinary space
const formattedNumberFR=new Intl.NumberFormat('fr-FR').format(number);
console.log(formattedNumberFR); // 1 234 567,89

Singular and plural

English plurals take an s, as in apples

const numbers=[1, 2, 5, 10, 100];
const pluralRules=new Intl.PluralRules('en-US'); // English rules
for (const number of numbers) {
  const pluralForm=pluralRules.select(number);

  console.log(`In English, ${number} item${pluralForm !=='one' ? 's' : ''}.`);
}
// In English, 1 item.
// In English, 2 items.
// In English, 5 items.
// In English, 10 items.
// In English, 100 items.

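Ordinals are another case. First, second, third, fourth, fifth: you have probably spotted the pattern. Apart from 1, 2, and 3, the ordinal is the number plus th, and the short forms are 1st, 2nd, and so on. The table below shows the rule

1 → st, and except for 11, numbers ending in 1 follow it: 21st through 91st
2 → nd, and except for 12, numbers ending in 2 follow it: 22nd through 92nd
3 → rd, and except for 13, numbers ending in 3 follow it: 23rd through 93rd
Everything else takes th

Number   English       Ordinal
1        One           1st
2        Two           2nd
3        Three         3rd
4        Four          4th
10       Ten           10th
11       Eleven        11th
12       Twelve        12th
13       Thirteen      13th
20       Twenty        20th
21       Twenty-one    21st
30       Thirty        30th
31       Thirty-one    31st
100      One hundred   100th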

const enOrdinalRules=new Intl.PluralRules("en-US", { type: "ordinal" });

const suffixes=new Map([
  ["one", "st"],
  ["two", "nd"],
  ["few", "rd"],
  ["other", "th"],
]);
const formatOrdinals=(n)=> {
  const rule=enOrdinalRules.select(n);
  const suffix=suffixes.get(rule);
  return `${n}${suffix}`;
};

formatOrdinals(0); // '0th'
formatOrdinals(1); // '1st'
formatOrdinals(2); // '2nd'
formatOrdinals(3); // '3rd'
formatOrdinals(4); // '4th'
formatOrdinals(11); // '11th'
formatOrdinals(21); // '21st'
formatOrdinals(42); // '42nd'
formatOrdinals(103); // '103rd'

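Number formatting

Integer and decimal separators

Common integer grouping includes thousands grouping, such as 1,000,000, and ten-thousands grouping, such as 1000 0000. It differs by language

Common decimal separators are . and , as in 1000.00 or 1000,00. They differ by language as well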

const number=1234567.89;
// 格式化为默认数字格式
const formattedNumber=new Intl.NumberFormat().format(number);
console.log(formattedNumber); // 输出: 1,234,567.89
// 格式化为指定语言环境的数字格式
const formattedNumberDE=new Intl.NumberFormat('de-DE').format(number);
console.log(formattedNumberDE); // 输出: 1.234.567,89
// 格式化为指定语言环境的数字格式
const formattedNumberFR=new Intl.NumberFormat('fr-FR').format(number);
console.log(formattedNumberFR); // 输出: 1 234 567,89
const formattedNumberCN=new Intl.NumberFormat('zh-CN').format(number);
console.log(formattedNumberCN)  // 输出: 1,234,567.89

Options also control the minimum and maximum number of fraction digits

const number=1234567.89123;
const formattedNumber=new Intl.NumberFormat('en-US', {
  style: 'decimal', // 可选 'decimal' 表示常规数字格式
  maximumFractionDigits: 3, // 小数部分最多显示三位
}).format(number);
console.log(formattedNumber); // 输出: 1,234,567.891

Percentages

A percentage is normally a number from 0 to 100 followed by %, but the French convention is ' %' rather than '%', with a space before the sign

const percentage=0.75;
// French locale, default fraction digits (the space before % is a no-break space)
const formattedPercentageDefault=new Intl.NumberFormat('fr-FR', {
  style: 'percent'
}).format(percentage);
console.log(formattedPercentageDefault); // output: '75 %'
// French locale, exactly two fraction digits
const formattedPercentageFR=new Intl.NumberFormat('fr-FR', {
  style: 'percent',
  minimumFractionDigits: 2,
  maximumFractionDigits: 2,
}).format(percentage);
console.log(formattedPercentageFR); // output: '75,00 %'
// US English locale, default fraction digits
const formattedPercentageUS=new Intl.NumberFormat('en-US', {
  style: 'percent'
}).format(percentage);
console.log(formattedPercentageUS); // output: '75%'
// Chinese locale, exactly two fraction digits
const formattedPercentageCN=new Intl.NumberFormat('zh-CN', {
  style: 'percent',
  minimumFractionDigits: 2,
  maximumFractionDigits: 2,
}).format(percentage);
console.log(formattedPercentageCN); // output: '75.00%'

Compact notation

console.log(new Intl.NumberFormat('en-US', { notation: "compact" , compactDisplay: "short", maximumFractionDigits: 2 }).format(987654321)) // 987.65M
console.log(new Intl.NumberFormat('zh-CN', { notation: "compact" , compactDisplay: "short", maximumFractionDigits: 2 }).format(987654321)) // 9.88亿

Currency

Currency symbols

For example, the renminbi is ¥, the US dollar is $, the euro is €, and the pound is £

new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' }).formatToParts().filter(i=> i.type==='currency')[0].value // '$'
new Intl.NumberFormat('zh-CN', { style: 'currency', currency: 'CNY' }).formatToParts().filter(i=> i.type==='currency')[0].value // '¥'
new Intl.NumberFormat('de-DE', { style: 'currency', currency: 'EUR' }).formatToParts().filter(i=> i.type==='currency')[0].value // '€'

货币格式化

The examples below use a few major economies and currencies we meet often. Pay attention to the output

// US dollar; $ is the dollar sign
const numberUSD=123456789.12;
const formattedNumberUSD=new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' }).format(numberUSD);
console.log(formattedNumberUSD); // $123,456,789.12

// Renminbi; ¥ is the renminbi sign
const numberCNY=123456789.12;
const formattedNumberCNY=new Intl.NumberFormat('zh-CN', { style: 'currency', currency: 'CNY' }).format(numberCNY);
console.log(formattedNumberCNY); // ¥123,456,789.12

// Euro; € is the euro sign
const numberEUR=123456789.12;
const formattedNumberEUR=new Intl.NumberFormat('de-DE', { style: 'currency', currency: 'EUR' }).format(numberEUR);
console.log(formattedNumberEUR); // 123.456.789,12 €

// Japanese yen (no minor unit, so the fraction is rounded away)
const numberJPY=123456789.12;
const formattedNumberJPY=new Intl.NumberFormat('ja-JP', { style: 'currency', currency: 'JPY' }).format(numberJPY);
console.log(formattedNumberJPY); // ￥123,456,789

// Pound sterling; £ is the pound sign
const numberGBP=123456789.12;
const formattedNumberGBP=new Intl.NumberFormat('en-GB', { style: 'currency', currency: 'GBP' }).format(numberGBP);
console.log(formattedNumberGBP); // £123,456,789.12

// Hong Kong dollar
const numberHKD=123456789.12;
const formattedNumberHKD=new Intl.NumberFormat('zh-HK', { style: 'currency', currency: 'HKD' }).format(numberHKD);
console.log(formattedNumberHKD); // HK$123,456,789.12

// South Korean won (no minor unit, so the fraction is rounded away)
const numberKRW=123456789.12;
const formattedNumberKRW=new Intl.NumberFormat('ko-KR', { style: 'currency', currency: 'KRW' }).format(numberKRW);
console.log(formattedNumberKRW); // ₩123,456,789

As a compatibility fallback for currency formatting you can use Number.prototype.toLocaleString, or the polyfill provided by formatjs

// US dollar; $ is the dollar sign
const numberUSD=123456789.12;
const formattedNumberUSD=new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' }).format(numberUSD);
const formattedNumberUSDByLocaleString=Number(numberUSD).toLocaleString('en-US', { style: 'currency', currency: 'USD' });
console.log(formattedNumberUSD); // $123,456,789.12
console.log(numberUSD.toLocaleString()) // 123,456,789.12
console.log(formattedNumberUSDByLocaleString) // $123,456,789.12

Displaying currency names

The United States has the dollar and China has the renminbi, for example. The names can be produced directly

let currencyNames=new Intl.DisplayNames(["zh-Hans"], { type: "currency" });
console.log(currencyNames.of("USD")); // "美元"
console.log(currencyNames.of("EUR")); // "欧元"
console.log(currencyNames.of("TWD")); // "新台币"
console.log(currencyNames.of("CNY")); // "人民币"

currencyNames=new Intl.DisplayNames(["zh-Hant"], { type: "currency" });
console.log(currencyNames.of("USD")); // "美元"
console.log(currencyNames.of("EUR")); // "歐元"
console.log(currencyNames.of("TWD")); // "新台幣"
console.log(currencyNames.of("CNY")); // "人民幣"

Sorting & lists

Intl.Collator

Think of sorting a phone book or an address book: the order differs between languages because their letters collate differently. 「Intl.Collator」 is one of JavaScript's internationalization APIs. It compares and sorts strings so that sorting is correct in a multilingual environment. You create a 「Collator」 object that compares and sorts strings for a particular language and region, taking the differences between languages into account.

console.log(['Z', 'a', 'z', 'ä'].sort(new Intl.Collator('de').compare));
// Expected output: Array ["a", "ä", "z", "Z"]
console.log(['Z', 'a', 'z', 'ä'].sort(new Intl.Collator('sv').compare));
// Expected output: Array ["a", "z", "Z", "ä"]
console.log(['Z', 'a', 'z', 'ä'].sort(new Intl.Collator('de', { caseFirst: 'upper' }).compare));
// Expected output: Array ["a", "ä", "Z", "z"]

// Create an Intl.Collator object
const collator=new Intl.Collator('en-US', { sensitivity: 'base', usage: 'sort' });

// The output below is sorted by pinyin: guang jin mei ming tian yang
console.log(['今','天','阳','光', '明', '媚'].sort(new Intl.Collator('zh').compare)); // ['光', '今', '媚', '明', '天', '阳']

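The options object accepts usage and sensitivity, among others, with the following values

「usage」: the purpose of the comparison
'sort': for sorting (the default).
'search': for searching, that is, filtering a list for matching strings.
「sensitivity」: which differences make two strings compare as unequal
'base': only base letters matter; accents and case are ignored (a = á = A).
'accent': base letters and accents matter; case is ignored (a ≠ á, a = A).
'case': base letters and case matter; accents are ignored (a ≠ A, a = á).
'variant': base letters, accents, and case all matter.
caseFirst
"upper": uppercase sorts before lowercase when two strings otherwise compare equal.
"lower": lowercase sorts before uppercase when two strings otherwise compare equal.
"false": no explicit preference; the locale's default order is used.
ignorePunctuation: boolean, whether punctuation is ignored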
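The four sensitivity levels are easiest to tell apart by comparing one pair that differs only by accent and one that differs only by case. The sketch below does that for the 'en' locale; equalUnder is a hypothetical helper.

```javascript
// True when the two strings compare as equal at the given sensitivity.
function equalUnder(sensitivity, a, b) {
  return new Intl.Collator('en', { sensitivity }).compare(a, b) === 0;
}

for (const sensitivity of ['base', 'accent', 'case', 'variant']) {
  console.log(
    sensitivity,
    equalUnder(sensitivity, 'a', 'á'), // accent difference only
    equalUnder(sensitivity, 'a', 'A'), // case difference only
  );
}
// base    true  true
// accent  false true
// case    true  false
// variant false false
```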

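It can be applied in the following ways

String comparison

const collator=new Intl.Collator('en-US', { sensitivity: 'base', usage: 'sort' }); // create an Intl.Collator object
const result=collator.compare('apple', 'Banana');
console.log(result); // a negative number, typically -1 (apple sorts before Banana)

Sorting arrays

By default, Array.prototype.sort orders strings by UTF-16 code unit, which for English letters matches ASCII order, so getting an order like AaBb used to mean writing your own comparison callback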


// Create a custom Collator object
const customCollator=new Intl.Collator('en-US', {
  sensitivity: 'base',
  usage: 'sort',
  ignorePunctuation: true,
  caseFirst: 'false',
});
// Custom comparison function that ignores whitespace and case
function customCompare(a, b) {
  // Remove whitespace and convert to lowercase before comparing
  const stringA=a.replace(/\s/g, '').toLowerCase();
  const stringB=b.replace(/\s/g, '').toLowerCase();
  if (stringA < stringB) {
    return -1;
  }
  if (stringA > stringB) {
    return 1;
  }
  return 0;
}
const data=['Apple', 'banana', 'cherry', 'apple pie', 'Banana Split', 'cherry tart'];
const data2=data.slice()
// Old way: sort with a comparison callback
console.log(data.sort(customCompare)); // sorted result: ['Apple', 'apple pie', 'banana', 'Banana Split', 'cherry', 'cherry tart']
// New way: sort with the custom Collator object
console.log(data2.sort(customCollator.compare)); // sorted result: ['Apple', 'apple pie', 'banana', 'Banana Split', 'cherry', 'cherry tart']

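Both approaches give the same result, but Intl.Collator is clearly the more elegant one because it is driven by configuration.

Intl.ListFormat

「Intl.ListFormat」 is one of JavaScript's internationalization APIs. It formats lists so that they read naturally in a multilingual environment. Intl.ListFormat lets you specify how list items are joined (a comma, "and", and so on), as well as the style and language of the list.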

const items=['apples', 'bananas', 'cherries'];

const longOrFormatter=new Intl.ListFormat('en-US', { style: 'long', type: 'disjunction' });
console.log(longOrFormatter.format(items)); // "apples, bananas, or cherries"

const shortAndFormatter=new Intl.ListFormat('en-US', { style: 'short', type: 'conjunction' });
console.log(shortAndFormatter.format(items)); // "apples, bananas, & cherries"

const narrowAndFormatter=new Intl.ListFormat('en-US', { style: 'narrow', type: 'conjunction' });
console.log(narrowAndFormatter.format(items)); // "apples, bananas, cherries"

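The options object accepts style and type, with the following values

「style」: the style of the list, with three possible values
'long': the full form, for example "A, B, and C".
'short': an abbreviated form, for example "A, B, & C".
'narrow': the most compact form, for example "A, B, C".
「type」: how the list items are joined, with three possible values
'conjunction': uses "and" (the default), for example "A、B和C" in Chinese.
'disjunction': uses "or", for example "A、B或C" in Chinese.
'unit': a list of values with units, for example "5 pounds, 12 ounces".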

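Calendars

Calendars are everywhere. In China we deal with both the Gregorian calendar (gregory) and the traditional Chinese calendar (农历).

Many national holidays follow the Chinese calendar, such as the Spring Festival and the Mid-Autumn Festival. Every household used to keep a printed almanac that listed what each day was auspicious for. It is rare now, though many older people still trust it.

In the same way, each region has its own calendar system

「Islamic calendar (Hijri Calendar)」: a lunar calendar and the official calendar of Islam, based on the cycle of the moon. It has 12 months and its year is shorter than the Gregorian year, so its dates drift through the seasons.
「Hebrew calendar」: the official calendar of Judaism, based on the cycles of both the sun and the moon. It has 12 months, with a 13th added in leap years, and some months vary in length, which keeps it aligned with the seasons.
「Lunar calendars」: based on the cycle of the moon. Different regions and cultures have their own systems, such as the Chinese, Korean, and Vietnamese calendars.
「Persian calendar」: also called the Solar Hijri calendar, a solar calendar used in Iran and some neighboring countries that differs somewhat from the Gregorian calendar.
「Indian calendars」: India has several, including Vikram Samvat, the national calendar (Saka Samvat), and the Tamil calendar.
「Bahá'í calendar」: the calendar of the Bahá'í Faith, with 19 months of 19 days each plus a few intercalary days.
「Ethnic calendars」: some cultures and peoples have calendars of their own, used to commemorate particular historical events and festivals.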

const date=new Date(); // 当前日期, Mon Oct 30 2023 20:00:50 GMT+0800 (中国标准时间)
const formattedDate=new Intl.DateTimeFormat('ar-SA-u-ca-islamic', { year: 'numeric', month: 'long', day: 'numeric' }).format(date);
console.log(formattedDate); // ?? ???? ????? ???? ??

const date=new Date(1994, 1, 26);
const formattedDate=new Intl.DateTimeFormat('zh-CN-u-ca-chinese', { year: 'numeric', month: 'long', day: 'numeric' }).format(date);
console.log(formattedDate); // 1994甲戌年正月17

// 不同语言下不同日历的名称
const calendarNames=new Intl.DisplayNames('en-US', { type: 'calendar' });
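console.log(calendarNames.of('gregory')); // e.g. Gregorian Calendar (exact wording depends on the CLDR version)
console.log(calendarNames.of('islamic')); // e.g. Islamic Calendar (exact wording depends on the CLDR version)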

const calendarNames=new Intl.DisplayNames('zh-CN', { type: 'calendar' });
console.log(calendarNames.of('gregory')); // 输出：公历
console.log(calendarNames.of('islamic')); // 输出：伊斯兰历

Besides gregory, the calendar identifier accepts values such as:

"buddhist": Buddhist calendar
"chinese": traditional Chinese (lunisolar) calendar
"coptic": Coptic calendar
"ethiopic": Ethiopic calendar
"gregory": Gregorian calendar
"hebrew": Hebrew calendar
"indian": Indian national (Saka) calendar
"islamic": Islamic (Hijri) calendar
"islamic-civil": tabular (arithmetic) Islamic calendar with the civil epoch
"persian": Persian calendar

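A runtime may silently fall back to another calendar when the requested one is unavailable, so it can help to check what was actually resolved. The sketch below uses the calendar option of Intl.DateTimeFormat together with resolvedOptions(); the helper name formatWithCalendar is an assumption made for illustration.

```javascript
// Hypothetical helper: formats a date in the requested calendar and reports
// which calendar the runtime actually used.
function formatWithCalendar(date, locale, calendar) {
  const formatter = new Intl.DateTimeFormat(locale, {
    calendar,
    dateStyle: 'long',
    timeZone: 'UTC',
  });
  return {
    calendar: formatter.resolvedOptions().calendar,
    text: formatter.format(date),
  };
}

const sampleDate = new Date(Date.UTC(2023, 9, 30));
const gregorian = formatWithCalendar(sampleDate, 'en-US', 'gregory');
const hebrew = formatWithCalendar(sampleDate, 'en-US', 'hebrew');

console.log(gregorian.calendar, gregorian.text);
console.log(hebrew.calendar, hebrew.text);
```

If the resolved calendar differs from the requested one, the UI can fall back to Gregorian labels instead of showing a mislabelled date.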
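Weekdays

The Japanese calendar deserves a mention: its weekday names are neither the Chinese 周一 to 周日 nor Sunday to Saturday, and its years are counted by era. Japan still has an emperor, and each reign has an era name (年号). The recent eras are:

「Meiji era」:
Era name: 明治 (Meiji)
Years: 1868 to 1912
Note: the Meiji era marked the start of Japan's modernization and industrialization.
「Taisho era」:
Era name: 大正 (Taisho)
Years: 1912 to 1926
Note: a relatively short era that also saw political and social change in Japan.
「Showa era」:
Era name: 昭和 (Showa)
Years: 1926 to 1989
Note: the Showa era covered the war, post-war reconstruction and Japan's rise as a modern industrial power.
「Heisei era」:
Era name: 平成 (Heisei)
Years: 1989 to 2019
Note: the Heisei era included a period of economic prosperity and some social change.
「Reiwa era」: Reiwa-era Ultraman series (https://baike.baidu.com/item/%E4%BB%A4%E5%92%8C%E7%B3%BB%E5%A5%A5%E7%89%B9%E6%9B%BC/50340521)
Era name: 令和 (Reiwa)
Years: 2019 to present
Note: Reiwa is Japan's current era; it began on May 1, 2019.

A Japanese calendar starts the week on Sunday (日), while Monday to Saturday are named 月, 火, 水, 木, 金 and 土, unlike the numbered weekdays used in Chinese.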
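The era names above can be produced by Intl itself through the japanese calendar. This is a small sketch, assuming the runtime ships Japanese locale data; the helper name formatJapaneseEraYear is made up for this example.

```javascript
// Hypothetical helper: formats the era and year of a date in the Japanese calendar.
function formatJapaneseEraYear(date) {
  return new Intl.DateTimeFormat('ja-JP-u-ca-japanese', {
    era: 'long',
    year: 'numeric',
    timeZone: 'UTC',
  }).format(date);
}

const reiwaYear = formatJapaneseEraYear(new Date(Date.UTC(2020, 0, 1)));
const heiseiYear = formatJapaneseEraYear(new Date(Date.UTC(1990, 0, 1)));

console.log(reiwaYear); // a date in 2020 falls in the Reiwa era
console.log(heiseiYear); // a date in 1990 falls in the Heisei era
```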

// Weekday names for the past 7 days, oldest first
const today=new Date();

// One options object is enough: only the locale differs
const options={ weekday: 'long' };
const formatCN=new Intl.DateTimeFormat('zh-CN', options);
const formatJP=new Intl.DateTimeFormat('ja-JP', options);

console.log('\nWeekdays of the past week:');
const CNArr=[]
const JPArr=[]
for (let i=0; i < 7; i++) {
  const pastDate=new Date(today);
  pastDate.setDate(today.getDate() - i);
  CNArr.unshift(formatCN.format(pastDate))
  JPArr.unshift(formatJP.format(pastDate))
}
// Sample output when run on a Monday
console.log('CNArr', CNArr.join(' ')) // CNArr 星期二 星期三 星期四 星期五 星期六 星期日 星期一
console.log('JPArr', JPArr.join(' ')) // JPArr 火曜日 水曜日 木曜日 金曜日 土曜日 日曜日 月曜日

日历单位

const dateTimeFields=new Intl.DisplayNames('en-US', { type: 'dateTimeField' });
console.log(dateTimeFields.of('era')); // 输出：Era, 纪元的意思
console.log(dateTimeFields.of('year')); // 输出：Year
console.log(dateTimeFields.of('month')); // 输出：Month
console.log(dateTimeFields.of('day')); // 输出：Day
console.log(dateTimeFields.of('weekday')); // 输出：Day of the week
console.log(dateTimeFields.of('hour')); // 输出：Hour
console.log(dateTimeFields.of('minute')); // 输出：Minute
console.log(dateTimeFields.of('second')); // 输出：Second
console.log(dateTimeFields.of('quarter')); // 输出：Quarter

const dateTimeFields=new Intl.DisplayNames('ja-JP', { type: 'dateTimeField' });
console.log(dateTimeFields.of('era')); // 输出：時代
console.log(dateTimeFields.of('year')); // 输出：年
console.log(dateTimeFields.of('month')); // 输出：月
console.log(dateTimeFields.of('day')); // 输出：日
console.log(dateTimeFields.of('weekday')); // 输出：曜日
console.log(dateTimeFields.of('hour')); // 输出：時
console.log(dateTimeFields.of('minute')); // 输出：分
console.log(dateTimeFields.of('second')); // 输出：秒
console.log(dateTimeFields.of('quarter')); // 输出：四半期

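First day of the week

In en-US the week usually starts on Sunday, while in zh-CN it usually starts on Monday. The locale's week information tells you which applies.

The weekInfo property of an Intl.Locale instance has a firstDay field, a number from 1 to 7 where 1 is Monday and 7 is Sunday (unlike Date.prototype.getDay, it does not use 0 for Sunday). Regions differ in which day starts the week, so this field lets a calendar pick the correct first column. Newer engines expose the same data through the getWeekInfo() method, and not every browser implements it, so feature-detect before use.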

(new Intl.Locale('zh-CN')).weekInfo // {"firstDay":1,"weekend":[6,7],"minimalDays":1}
(new Intl.Locale('en-US')).weekInfo // {"firstDay":7,"weekend":[6,7],"minimalDays":1}

const locale='zh-CN'
console.log(new Intl.DateTimeFormat(locale, { weekday: "long" }).format(new Date(0, 0, new Intl.Locale(locale).weekInfo.firstDay))) // '星期一'

const locale='en-US'
console.log(new Intl.DateTimeFormat(locale, { weekday: "long" }).format(new Date(0, 0, new Intl.Locale(locale).weekInfo.firstDay))) // 'Sunday'

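This works because the Date constructor maps years 0 to 99 to 1900 to 1999, so new Date(0, 0, 7) is the same as new Date(1900, 0, 7), i.e. 7 January 1900 in local time, which was a Sunday, while new Date(0, 0, 1) is 1 January 1900, a Monday. In January 1900 the day of the month therefore equals the firstDay number (1 is Monday, 7 is Sunday).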
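The 1900 trick relies on local time and on weekInfo being available. A slightly more defensive sketch, under the assumption that falling back to Monday is acceptable when the runtime has no week data, pins the calculation to UTC and to a known Monday (1 January 2024). The helper names getFirstDay and weekdayNames are illustrative.

```javascript
// Reads firstDay (1 = Monday ... 7 = Sunday). Falls back to Monday when the
// runtime exposes neither getWeekInfo() nor weekInfo (an assumed default).
function getFirstDay(locale) {
  const loc = new Intl.Locale(locale);
  const info = typeof loc.getWeekInfo === 'function' ? loc.getWeekInfo() : loc.weekInfo;
  return info && info.firstDay ? info.firstDay : 1;
}

// Returns the seven weekday names in display order for the locale.
function weekdayNames(locale, firstDay = getFirstDay(locale)) {
  const formatter = new Intl.DateTimeFormat(locale, { weekday: 'long', timeZone: 'UTC' });
  // 1 January 2024 is a Monday, so day-of-month n maps to firstDay n.
  return Array.from({ length: 7 }, (_, i) =>
    formatter.format(new Date(Date.UTC(2024, 0, firstDay + i)))
  );
}

console.log(weekdayNames('en-US'));
console.log(weekdayNames('zh-CN'));
```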

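How to build an internationalized calendar component

The component should behave as follows:

Under zh-CN the week starts with 星期一 (Monday)
Under en-US the week starts with Sunday
Under Japanese, weekdays are shown as 月曜日, 火曜日, 水曜日, 木曜日, 金曜日, 土曜日 and 日曜日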

Intl-region-language-calendar：https://codesandbox.io/p/devbox/intl-region-language-calendar-xm7ls8?embed=1&file=%2Fsrc%2FApp.tsx

实现细节

如何获取一周的第一天

const localeUS='en-US'
new Intl.DateTimeFormat(localeUS, { weekday: "long" }).format(new Date(0, 0, new Intl.Locale(localeUS).weekInfo.firstDay)) // 'Sunday'

const localeCN='zh-CN'
new Intl.DateTimeFormat(localeCN, { weekday: "long" }).format(new Date(0, 0, new Intl.Locale(localeCN).weekInfo.firstDay)) // '星期一'

输出一周7天

const locale="en-US";
const firstDay=new Intl.Locale(locale).weekInfo.firstDay;
const formatInstace=new Intl.DateTimeFormat(locale, { weekday: "long" });
for (let i=0; i < 7; i++) {
  console.log(formatInstace.format(new Date(0, 0, firstDay + i)));
}
// Sunday
// Monday
// Tuesday
// Wednesday
// Thursday
// Friday
// Saturday

const localeCN="zh-CN";
const firstDayCN=new Intl.Locale(localeCN).weekInfo.firstDay;
const formatInstaceCN=new Intl.DateTimeFormat(localeCN, { weekday: "long" });
for (let i=0; i < 7; i++) {
  console.log(formatInstaceCN.format(new Date(0, 0, firstDayCN + i)));
}
// 星期一
// 星期二
// 星期三
// 星期四
// 星期五
// 星期六
// 星期日
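With the header row in place, the remaining piece of a calendar is the grid of days. The following sketch is not taken from the linked sandbox; it is one possible way to lay out a month given a firstDay in the weekInfo convention, and the function name buildMonthGrid is hypothetical.

```javascript
// Builds a month as rows of 7 cells. Empty cells before the 1st and after the
// last day are null. `month` is 0-based; `firstDay` is 1 (Monday) to 7 (Sunday).
function buildMonthGrid(year, month, firstDay) {
  const daysInMonth = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
  const jsWeekday = new Date(Date.UTC(year, month, 1)).getUTCDay(); // 0 = Sunday
  const isoWeekday = jsWeekday === 0 ? 7 : jsWeekday; // 1 = Monday ... 7 = Sunday
  const leadingBlanks = (isoWeekday - firstDay + 7) % 7;

  const cells = [
    ...Array(leadingBlanks).fill(null),
    ...Array.from({ length: daysInMonth }, (_, i) => i + 1),
  ];
  while (cells.length % 7 !== 0) cells.push(null); // pad the last week

  const weeks = [];
  for (let i = 0; i < cells.length; i += 7) weeks.push(cells.slice(i, i + 7));
  return weeks;
}

// January 2024 starts on a Monday.
console.log(buildMonthGrid(2024, 0, 1)[0]); // Monday-first layout
console.log(buildMonthGrid(2024, 0, 7)[0]); // Sunday-first layout
```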

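Intl support

Querying supported values

The Intl.supportedValuesOf API returns every value the current runtime supports for a given key, such as calendar, currency or timeZone: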

const calendars=Intl.supportedValuesOf("calendar");
console.log(calendars); // 输出所有支持的日历系统
// (18) ['buddhist', 'chinese', 'coptic', 'dangi', 'ethioaa', 'ethiopic', 'gregory', 'hebrew', 'indian', 'islamic', 'islamic-civil', 'islamic-rgsa', 'islamic-tbla', 'islamic-umalqura', 'iso8601', 'japanese', 'persian', 'roc']

const currencies=Intl.supportedValuesOf("currency");
console.log(currencies); // 输出所有支持的货币代码
// (159) ['AED', 'AFN', 'ALL', 'AMD', 'ANG', 'AOA', 'ARS', 'AUD', 'AWG', 'AZN', 'BAM', 'BBD', 'BDT', 'BGN', 'BHD', 'BIF', 'BMD', 'BND', 'BOB', 'BRL', 'BSD', 'BTN', 'BWP', 'BYN', 'BZD', 'CAD', 'CDF', 'CHF', 'CLP', 'CNY', 'COP', 'CRC', 'CUC', 'CUP', 'CVE', 'CZK', 'DJF', 'DKK', 'DOP', 'DZD', 'EGP', 'ERN', 'ETB', 'EUR', 'FJD', 'FKP', 'GBP', 'GEL', 'GHS', 'GIP', 'GMD', 'GNF', 'GTQ', 'GYD', 'HKD', 'HNL', 'HRK', 'HTG', 'HUF', 'IDR', 'ILS', 'INR', 'IQD', 'IRR', 'ISK', 'JMD', 'JOD', 'JPY', 'KES', 'KGS', 'KHR', 'KMF', 'KPW', 'KRW', 'KWD', 'KYD', 'KZT', 'LAK', 'LBP', 'LKR', 'LRD', 'LSL', 'LYD', 'MAD', 'MDL', 'MGA', 'MKD', 'MMK', 'MNT', 'MOP', 'MRU', 'MUR', 'MVR', 'MWK', 'MXN', 'MYR', 'MZN', 'NAD', 'NGN', 'NIO', …]

const timeZones=Intl.supportedValuesOf("timeZone");
console.log(timeZones); // 输出所有支持的时区
// (428) ['Africa/Abidjan', 'Africa/Accra', 'Africa/Addis_Ababa', 'Africa/Algiers', 'Africa/Asmera', 'Africa/Bamako', 'Africa/Bangui', 'Africa/Banjul', 'Africa/Bissau', 'Africa/Blantyre', 'Africa/Brazzaville', 'Africa/Bujumbura', 'Africa/Cairo', 'Africa/Casablanca', 'Africa/Ceuta', 'Africa/Conakry', 'Africa/Dakar', 'Africa/Dar_es_Salaam', 'Africa/Djibouti', 'Africa/Douala', 'Africa/El_Aaiun', 'Africa/Freetown', 'Africa/Gaborone', 'Africa/Harare', 'Africa/Johannesburg', 'Africa/Juba', 'Africa/Kampala', 'Africa/Khartoum', 'Africa/Kigali', 'Africa/Kinshasa', 'Africa/Lagos', 'Africa/Libreville', 'Africa/Lome', 'Africa/Luanda', 'Africa/Lubumbashi', 'Africa/Lusaka', 'Africa/Malabo', 'Africa/Maputo', 'Africa/Maseru', 'Africa/Mbabane', 'Africa/Mogadishu', 'Africa/Monrovia', 'Africa/Nairobi', 'Africa/Ndjamena', 'Africa/Niamey', 'Africa/Nouakchott', 'Africa/Ouagadougou', 'Africa/Porto-Novo', 'Africa/Sao_Tome', 'Africa/Tripoli', 'Africa/Tunis', 'Africa/Windhoek', 'America/Adak', 'America/Anchorage', 'America/Anguilla', 'America/Antigua', 'America/Araguaina', 'America/Argentina/La_Rioja', 'America/Argentina/Rio_Gallegos', 'America/Argentina/Salta', 'America/Argentina/San_Juan', 'America/Argentina/San_Luis', 'America/Argentina/Tucuman', 'America/Argentina/Ushuaia', 'America/Aruba', 'America/Asuncion', 'America/Bahia', 'America/Bahia_Banderas', 'America/Barbados', 'America/Belem', 'America/Belize', 'America/Blanc-Sablon', 'America/Boa_Vista', 'America/Bogota', 'America/Boise', 'America/Buenos_Aires', 'America/Cambridge_Bay', 'America/Campo_Grande', 'America/Cancun', 'America/Caracas', 'America/Catamarca', 'America/Cayenne', 'America/Cayman', 'America/Chicago', 'America/Chihuahua', 'America/Ciudad_Juarez', 'America/Coral_Harbour', 'America/Cordoba', 'America/Costa_Rica', 'America/Creston', 'America/Cuiaba', 'America/Curacao', 'America/Danmarkshavn', 'America/Dawson', 'America/Dawson_Creek', 'America/Denver', 'America/Detroit', 'America/Dominica', 'America/Edmonton', 
'America/Eirunepe', …]
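Intl.supportedValuesOf is itself missing in older runtimes and throws a RangeError for an unknown key, so a guarded wrapper can be convenient. A minimal sketch, with safeSupportedValues as an illustrative name:

```javascript
// Returns the supported values for `key`, or an empty array when the API is
// missing or the key is not recognised.
function safeSupportedValues(key) {
  if (typeof Intl.supportedValuesOf !== 'function') return [];
  try {
    return Intl.supportedValuesOf(key);
  } catch (err) {
    if (err instanceof RangeError) return []; // unknown key
    throw err;
  }
}

console.log(safeSupportedValues('calendar').includes('gregory'));
console.log(safeSupportedValues('not-a-key')); // []
```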

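Compatibility with older versions

Intl is the low-level API that browsers provide for i18n. BrowserStack, a cloud real-device testing tool (recommended by caniuse), is handy for checking support: https://live.browserstack.com/dashboard

Calling the API directly without handling compatibility throws a JS error such as "Intl.DisplayNames is not a constructor".

As a result, users on old operating system versions (who rarely upgrade) hit the JS error and see a blank page.

Google Pixel 4 (released October 15, 2019)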

formatjs

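The polyfills provided by formatjs can cover older versions: https://formatjs.io/docs/getting-started/installation

The polyfills depend on one another, so the packages must be imported in dependency order
On mobile (for example mini programs), importing everything blows up the bundle: an ordinary build is 2.6M after Gzip and grows to 9.1M with the full import. In the end the polyfills are imported in order and, on mobile, only the English locale data is included as a fallback, which keeps the bundle under 3M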

async function loadPolyfill() {
    // 如果当前环境不支持 Intl 或者 Intl.DisplayNames
    if (!window.Intl || !window.Intl.DisplayNames) {
        window.Intl=window.Intl || {}
        // 加载 polyfill
        await import('@formatjs/intl-getcanonicallocales/polyfill-force')
        await import('@formatjs/intl-locale/polyfill-force')
        await import('@formatjs/intl-displaynames/polyfill-force')
        await import('@formatjs/intl-displaynames/locale-data/en')
        return false
    } else {
        // 当前环境支持 Intl.DisplayNames API，不需要 Polyfill
        return true
    }
}
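The check above only looks at Intl.DisplayNames. When several polyfills are loaded in dependency order, it can be clearer to detect each feature separately. This is a sketch under that assumption; missingIntlFeatures and the REQUIRED list are illustrative and mirror the three packages imported above.

```javascript
// Features needed by the polyfill chain above, in dependency order.
const REQUIRED = ['getCanonicalLocales', 'Locale', 'DisplayNames'];

// Returns the names of required Intl members that are not available.
function missingIntlFeatures(intl = globalThis.Intl) {
  return REQUIRED.filter(name => !intl || typeof intl[name] !== 'function');
}

console.log(missingIntlFeatures()); // what the current runtime lacks
console.log(missingIntlFeatures({})); // everything is missing on an empty object
```

Each missing name can then be mapped to its polyfill import, so modern browsers download nothing extra.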

Why not Babel?

Babel's @babel/preset-env (https://www.babeljs.cn/docs/babel-preset-env#how-does-it-work) brings in helpers and polyfills at compile time, based on core-js-compat (https://www.npmjs.com/package/core-js-compat). Given the oldest browsers you need to support (https://github.com/browserslist/browserslist#queries) and the features your code actually uses, it references only the helpers and polyfills that are required.

// babel.config.js
module.exports={
  presets: [
    [
      '@babel/preset-env',
      {
        useBuiltIns: 'usage', // 根据每个文件里面，用到了哪些es的新特性和targets导入polyfill，更加精简
        corejs: 3, // 指定 core-js 版本
        targets: "> 0.25%, not dead" // 指定目标浏览器, 选取全球使用率超过 0.25% 的浏览器版本
      },
    ],
  ],
};

「Babel relies on core-js (https://github.com/zloirock/core-js) for polyfills, but core-js does not include polyfills for the Intl API (https://github.com/zloirock/core-js?tab=readme-ov-file#missing-polyfills), so Babel cannot polyfill the Intl API」

Nodejs使用

安装

npm i @formatjs/intl

使用

import {createIntl, createIntlCache} from '@formatjs/intl'

// This is optional but highly recommended
// since it prevents memory leak
const cache=createIntlCache()

const intlFr=createIntl(
  {
    locale: 'fr-FR',
    messages: {},
  },
  cache
)
const intlEn=createIntl(
  {
    locale: 'en-US',
    messages: {},
  },
  cache
)

// Call imperatively
console.log(intlFr.formatNumber(2000000000000)) // 2 000 000 000 000
console.log(intlEn.formatNumber(2000000000000)) // 2,000,000,000,000

作者:ManfredHu

来源-微信公众号:字节前端 ByteFE

出处:https://mp.weixin.qq.com/s/PByp6Pmc3vp7b0acyPT8yA

ByteDance interviewer: please implement large file upload with resumable upload

原作者：yeyan1996

原文链接：https://url.cn/5h66afn

前言

这段时间面试官都挺忙的，频频出现在博客文章标题，虽然我不是特别想蹭热度，但是实在想不到好的标题了-。-，蹭蹭就蹭蹭 :)

事实上我在面试的时候确实被问到了这个问题，而且是一道在线 coding 的编程题，当时虽然思路正确，可惜最终也并不算完全答对。

结束后花了一段时间整理了下思路，那么究竟该如何实现一个大文件上传，以及在上传中如何实现断点续传的功能呢？

本文将从零搭建前端和服务端，实现一个大文件上传和断点续传的 demo：

前端：vue element-ui
服务端：nodejs

文章有误解的地方，欢迎指出，将在第一时间改正，有更好的实现方式希望留下你的评论。

大文件上传

前端

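Most articles on large file uploads in the browser already describe the solution. The core is Blob.prototype.slice, which works much like Array.prototype.slice: calling slice on a file (a File is a Blob) returns one slice of it.

We can therefore cut the file into chunks according to a preset maximum number of chunks and, because HTTP requests can run concurrently, upload several chunks at once. One large transfer becomes many small parallel ones, which can shorten the upload considerably.

Because the uploads are concurrent, chunks may reach the server out of order, so each chunk also needs to record its position.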
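The slicing idea can be tried outside a browser, since Blob is also a global in recent Node.js versions. The sketch below slices by a fixed chunk size rather than by a fixed chunk count, which is an alternative to the approach the article takes later; sliceIntoChunks is a hypothetical name.

```javascript
// Cuts a Blob (or File) into chunks of at most `chunkSize` bytes and records
// each chunk's index so the server can restore the order.
function sliceIntoChunks(blob, chunkSize) {
  if (!(chunkSize > 0)) throw new RangeError('chunkSize must be positive');
  const chunks = [];
  for (let start = 0, index = 0; start < blob.size; start += chunkSize, index++) {
    chunks.push({ index, chunk: blob.slice(start, start + chunkSize) });
  }
  return chunks;
}

const demoBlob = new Blob(['a'.repeat(25)]);
const demoChunks = sliceIntoChunks(demoBlob, 10);
console.log(demoChunks.map(({ index, chunk }) => `${index}:${chunk.size}`));
```

A fixed size keeps every request small regardless of file size, while a fixed count keeps the number of requests predictable.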

服务端

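The server receives the chunks and merges them once all of them have arrived.

This raises two questions:

When should the chunks be merged, i.e. how does the server know the transfer is complete?
How are the chunks merged?

The first needs help from the front end: either every chunk carries the total number of chunks and the server merges automatically once it has received that many, or the front end sends an extra request telling the server to merge.

For the second, the Node.js API fs.appendFileSync appends data to a file synchronously. Once all chunks have arrived, the server creates an empty file and appends the chunks to it one by one, in order.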
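The article goes on to use the explicit merge request. For comparison, the first option (merging automatically once the announced number of chunks has arrived) could be tracked in memory roughly like this. ChunkTracker and its method names are assumptions for illustration, not part of the article's server.

```javascript
// Hypothetical helper: remembers which chunk indexes have arrived per upload.
class ChunkTracker {
  constructor() {
    this.received = new Map(); // uploadId -> Set of chunk indexes
  }

  // Records one chunk and returns true once all `total` chunks have arrived.
  record(uploadId, index, total) {
    if (!this.received.has(uploadId)) this.received.set(uploadId, new Set());
    const seen = this.received.get(uploadId);
    seen.add(index); // a Set ignores retransmitted duplicates
    const complete = seen.size === total;
    if (complete) this.received.delete(uploadId); // ready to merge, forget the upload
    return complete;
  }
}

const tracker = new ChunkTracker();
console.log(tracker.record('demo', 0, 2)); // false, one chunk still missing
console.log(tracker.record('demo', 1, 2)); // true, the server can merge now
```

In-memory state is lost on restart, which is one reason an explicit merge request is simpler for a demo.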

so，talk is cheap, show me the code，接着让我们用代码实现上面的思路吧。

前端部分

前端使用 Vue 作为开发框架，对界面没有太大要求，原生也可以，考虑到美观使用 Element-UI 作为 UI 框架。

上传控件

首先创建选择文件的控件，监听 change 事件以及上传按钮：

<template>
   <div>
    <input type="file" @change="handleFileChange" />
    <el-button @click="handleUpload">上传</el-button>
  </div>
</template>

<script>
export default {
  data: ()=> ({
    container: {
      file: null
    }
  }),
  methods: {
     handleFileChange(e) {
      const [file]=e.target.files;
      if (!file) return;
      Object.assign(this.$data, this.$options.data());
      this.container.file=file;
    },
    async handleUpload() {}
  }
};
</script>

请求逻辑

考虑到通用性，这里没有用第三方的请求库，而是用原生 XMLHttpRequest 做一层简单的封装来发请求：

request({
    url,
    method="post",
    data,
    headers={},
    requestList
}) {
    return new Promise(resolve=> {
        const xhr=new XMLHttpRequest();
        xhr.open(method, url);
        Object.keys(headers).forEach(key=>
            xhr.setRequestHeader(key, headers[key])
        );
        xhr.send(data);
        xhr.onload=e=> {
            resolve({
                data: e.target.response
            });
        };
    });
}

上传切片

接着实现比较重要的上传功能，上传需要做两件事：

对文件进行切片
将切片传输给服务端

<template>
  <div>
    <input type="file" @change="handleFileChange" />
    <el-button @click="handleUpload">上传</el-button>
  </div>
</template>

<script>
+ const LENGTH=10; // 切片数量

export default {
  data: ()=> ({
    container: {
      file: null
    },
+   data: []
  }),
  methods: {
    request() {},
    handleFileChange() {},
+    // 生成文件切片
+    createFileChunk(file, length=LENGTH) {
+      const fileChunkList=[];
+      const chunkSize=Math.ceil(file.size / length);
+      let cur=0;
+      while (cur < file.size) {
+        fileChunkList.push({ file: file.slice(cur, cur + chunkSize) });
+        cur +=chunkSize;
+      }
+      return fileChunkList;
+    },
+   // 上传切片
+    async uploadChunks() {
+      const requestList=this.data
+        .map(({ chunk, hash })=> {
+          const formData=new FormData();
+          formData.append("chunk", chunk);
+          formData.append("hash", hash);
+          formData.append("filename", this.container.file.name);
+          return { formData };
+        })
+        .map(async ({ formData })=>
+          this.request({
+            url: "http://localhost:3000",
+            data: formData
+          })
+        );
+      await Promise.all(requestList); // 并发切片
+    },
+    async handleUpload() {
+      if (!this.container.file) return;
+      const fileChunkList=this.createFileChunk(this.container.file);
+      this.data=fileChunkList.map(({ file }, index)=> ({
+        chunk: file,
+        hash: this.container.file.name + "-" + index // 文件名 + 数组下标
+      }));
+      await this.uploadChunks();
+    }
  }
};
</script>

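When the upload button is clicked, createFileChunk slices the file. The number of chunks is controlled by the constant LENGTH, set to 10 here, so the file is uploaded as 10 chunks.

createFileChunk uses a while loop and slice to push each chunk into the fileChunkList array, which it returns.

Each chunk needs an identifier, stored as hash. For now it is the file name plus the chunk index, which tells the back end which chunk it is when merging later.

uploadChunks then uploads every chunk: it puts the chunk, its hash and the file name into a FormData, calls the request function from the previous step (which returns a promise), and finally uses Promise.all to upload all chunks concurrently.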
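Promise.all starts every request at once. With only 10 chunks that is fine, but with many chunks it may be worth capping how many run in parallel. The sketch below is an optional variation, not what the article does; runWithLimit is a hypothetical helper that takes functions returning promises.

```javascript
// Runs async task functions with at most `limit` in flight at a time and
// resolves with the results in the original order.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker keeps pulling the next unstarted task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const current = next++;
      results[current] = await tasks[current]();
    }
  }

  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Usage: wrap each upload in a function so it does not start immediately.
const demoTasks = [1, 2, 3].map(n => async () => n * 2);
runWithLimit(demoTasks, 2).then(results => console.log(results));
```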

发送合并请求

这里使用整体思路中提到的第二种合并切片的方式，即前端主动通知服务端进行合并，所以前端还需要额外发请求，服务端接受到这个请求时主动合并切片

<template>
  <div>
    <input type="file" @change="handleFileChange" />
    <el-button @click="handleUpload">上传</el-button>
  </div>
</template>

<script>
export default {
  data: ()=> ({
    container: {
      file: null
    },
    data: []
  }),
  methods: {
    request() {},
    handleFileChange() {},
    createFileChunk() {},
    // Upload the chunks
    async uploadChunks() {
      const requestList=this.data
        .map(({ chunk, hash })=> {
          const formData=new FormData();
          formData.append("chunk", chunk);
          formData.append("hash", hash);
          formData.append("filename", this.container.file.name);
          return { formData };
        })
        .map(async ({ formData })=>
          this.request({
            url: "http://localhost:3000",
            data: formData
          })
        );
      await Promise.all(requestList);
+      // 合并切片
+     await this.mergeRequest();
    },
+    async mergeRequest() {
+      await this.request({
+        url: "http://localhost:3000/merge",
+        headers: {
+          "content-type": "application/json"
+        },
+        data: JSON.stringify({
+          filename: this.container.file.name
+        })
+      });
+    },
    async handleUpload() {}
  }
};
</script>

服务端部分

简单使用 HTTP 模块搭建服务端：

const http=require("http");
const server=http.createServer();

server.on("request", async (req, res)=> {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Headers", "*");
  if (req.method==="OPTIONS") {
    res.statusCode=200;
    res.end();
    return;
  }
});

server.listen(3000, ()=> console.log("正在监听 3000 端口"));

接受切片

使用 multiparty 包处理前端传来的 FormData，在 multiparty.parse 的回调中，files 参数保存了 FormData 中文件，fields 参数保存了 FormData 中非文件的字段：

const http=require("http");
const path=require("path");
const fse=require("fs-extra");
const multiparty=require("multiparty");

const server=http.createServer();
+ const UPLOAD_DIR=path.resolve(__dirname, "..", "target"); // 大文件存储目录

server.on("request", async (req, res)=> {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Headers", "*");
  if (req.method==="OPTIONS") {
    res.statusCode=200;
    res.end();
    return;
  }

+  const multipart=new multiparty.Form();

+  multipart.parse(req, async (err, fields, files)=> {
+    if (err) {
+      return;
+    }
+    const [chunk]=files.chunk;
+    const [hash]=fields.hash;
+    const [filename]=fields.filename;
+    const chunkDir=`${UPLOAD_DIR}/${filename}`;

+   // 切片目录不存在，创建切片目录
+    if (!fse.existsSync(chunkDir)) {
+      await fse.mkdirs(chunkDir);
+    }

+      // fs-extra 专用方法，类似 fs.rename 并且跨平台
+      // fs-extra 的 rename 方法 windows 平台会有权限问题
+      // https://github.com/meteor/meteor/issues/7852#issuecomment-255767835
+      await fse.move(chunk.path, `${chunkDir}/${hash}`);
+    res.end("received file chunk");
+  });
});

server.listen(3000, ()=> console.log("正在监听 3000 端口"));

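In the chunk object produced by multiparty, path is the location of the temporary file and size is its size. The multiparty documentation suggests moving the temporary file, i.e. the chunk, by renaming it with fs.rename; since I use fs-extra, whose rename has permission problems on Windows, the code uses fse.move instead.

When a chunk arrives, the server first creates the directory that stores chunks if it does not exist. Because the front end sends a unique hash with every chunk, the hash is used as the file name and the chunk is moved from its temporary path into the chunk directory. The result looks like this: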

合并切片

在接收到前端发送的合并请求后，服务端将文件夹下的所有切片进行合并

const http=require("http");
const path=require("path");
const fse=require("fs-extra");

const server=http.createServer();
const UPLOAD_DIR=path.resolve(__dirname, "..", "target"); // 大文件存储目录

+ const resolvePost=req=>
+   new Promise(resolve=> {
+     let chunk="";
+     req.on("data", data=> {
+       chunk +=data;
+     });
+     req.on("end", ()=> {
+       resolve(JSON.parse(chunk));
+     });
+   });

+ // 合并切片
+ const mergeFileChunk=async (filePath, filename)=> {
+   const chunkDir=`${UPLOAD_DIR}/${filename}`;
+   const chunkPaths=await fse.readdir(chunkDir);
+   // readdir does not guarantee order, so sort by the index after the last "-"
+   chunkPaths.sort((a, b)=> Number(a.split("-").pop()) - Number(b.split("-").pop()));
+   await fse.writeFile(filePath, "");
+   chunkPaths.forEach(chunkPath=> {
+     fse.appendFileSync(filePath, fse.readFileSync(`${chunkDir}/${chunkPath}`));
+     fse.unlinkSync(`${chunkDir}/${chunkPath}`);
+   });
+   fse.rmdirSync(chunkDir); // remove the chunk directory after merging
+ };

server.on("request", async (req, res)=> {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Headers", "*");
  if (req.method==="OPTIONS") {
    res.statusCode=200;
    res.end();
    return;
  }

+   if (req.url==="/merge") {
+     const data=await resolvePost(req);
+     const { filename }=data;
+     const filePath=`${UPLOAD_DIR}/${filename}`;
+     await mergeFileChunk(filePath, filename);
+     res.end(
+       JSON.stringify({
+         code: 0,
+         message: "file merged success"
+       })
+     );
+   }

});

server.listen(3000, ()=> console.log("正在监听 3000 端口"));

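Because the merge request carries the file name, the server can find the chunk directory created in the previous step.

It then creates an empty file with fse.writeFile and appends the chunks from the chunk directory to it with fse.appendFileSync, deleting each chunk once it has been appended and removing the chunk directory after the last one. The merged file is meant to be named after the chunk directory plus the extension. In the code shown at this stage, however, both paths are built from the same file name and would collide; the hash-based naming adopted in the hash section (hash as the directory name, hash plus extension as the file name) keeps them apart.

That completes a simple large file upload. Next we extend it with some extra features.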

显示上传进度条

上传进度分两种，一个是每个切片的上传进度，另一个是整个文件的上传进度，而整个文件的上传进度是基于每个切片上传进度计算而来，所以我们需要先实现切片的上传进度。

切片进度条

XMLHttpRequest 原生支持上传进度的监听，只需要监听 upload.onprogress 即可，我们在原来的 request 基础上传入 onProgress 参数，给 XMLHttpRequest 注册监听事件：

 // xhr
    request({
      url,
      method="post",
      data,
      headers={},
+      onProgress=e=> e,
      requestList
    }) {
      return new Promise(resolve=> {
        const xhr=new XMLHttpRequest();
+        xhr.upload.onprogress=onProgress;
        xhr.open(method, url);
        Object.keys(headers).forEach(key=>
          xhr.setRequestHeader(key, headers[key])
        );
        xhr.send(data);
        xhr.onload=e=> {
          resolve({
            data: e.target.response
          });
        };
      });
    }

由于每个切片都需要触发独立的监听事件，所以还需要一个工厂函数，根据传入的切片返回不同的监听函数。

在原先的前端上传逻辑中新增监听函数部分：

    // Upload the chunks
    async uploadChunks(uploadedList=[]) {
      const requestList=this.data
        .map(({ chunk, hash, index })=> {
          const formData=new FormData();
          formData.append("chunk", chunk);
          formData.append("hash", hash);
          formData.append("filename", this.container.file.name);
          return { formData, index };
        })
        .map(async ({ formData, index })=>
          this.request({
            url: "http://localhost:3000",
            data: formData,
+           onProgress: this.createProgressHandler(this.data[index])
          })
        );
      await Promise.all(requestList);
      // Merge the chunks
      await this.mergeRequest();
    },
    async handleUpload() {
      if (!this.container.file) return;
      const fileChunkList=this.createFileChunk(this.container.file);
      this.data=fileChunkList.map(({ file }, index)=> ({
        chunk: file,
+       index,
+       size: file.size, // used by the file progress computation
        hash: this.container.file.name + "-" + index,
+       percentage: 0
      }));
      await this.uploadChunks();
    },
+   createProgressHandler(item) {
+      return e=> {
+        item.percentage=parseInt(String((e.loaded / e.total) * 100));
+      };
+    }

While uploading, each chunk's listener updates the percentage property of its element in the data array; the data array can then be rendered in the view.

文件进度条

将每个切片已上传的部分累加，除以整个文件的大小，就能得出当前文件的上传进度，所以这里使用 Vue 计算属性：

computed: {
     uploadPercentage() {
        if (!this.container.file || !this.data.length) return 0;
        const loaded=this.data
          .map(item=> item.size * item.percentage)
          .reduce((acc, cur)=> acc + cur);
        return parseInt((loaded / this.container.file.size).toFixed(2));
      }
}
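The same calculation can be written as a plain function, which makes it easy to check by hand. The sketch assumes each chunk object carries size (bytes) and percentage (0 to 100), as the computed property above expects; totalPercentage is an illustrative name.

```javascript
// Combines per-chunk progress into a whole-file percentage (0 to 100).
function totalPercentage(chunks, fileSize) {
  if (!fileSize || chunks.length === 0) return 0;
  const loadedBytes = chunks.reduce(
    (sum, { size, percentage }) => sum + size * (percentage / 100),
    0
  );
  return Math.min(100, Math.floor((loadedBytes / fileSize) * 100));
}

// One chunk finished and one half done: 50 + 25 of 100 bytes.
console.log(totalPercentage(
  [{ size: 50, percentage: 100 }, { size: 50, percentage: 50 }],
  100
));
```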

最终效果如下：

断点续传

断点续传的原理在于前端/服务端需要记住已上传的切片，这样下次上传就可以跳过之前已上传的部分，有两种方案实现记忆的功能：

前端使用 localStorage 记录已上传的切片 hash。
服务端保存已上传的切片 hash，前端每次上传前向服务端获取已上传的切片。

第一种是前端的解决方案，第二种是服务端，而前端方案有一个缺陷，如果换了个浏览器就失去了记忆的效果，所以这里选取后者。

生成 hash

无论是前端还是服务端，都必须要生成文件和切片的 hash，之前我们使用文件名 + 切片下标作为切片 hash，这样做文件名一旦修改就失去了效果，而事实上只要文件内容不变，hash 就不应该变化，所以正确的做法是根据文件内容生成 hash，所以我们需要修改一下 hash 的生成规则。

这里用到另一个库 spark-md5，它可以根据文件内容计算出文件的 hash 值，另外考虑到如果上传一个超大文件，读取文件内容计算 hash 是非常耗费时间的，并且会引起 UI 的阻塞，导致页面假死状态，所以我们使用 web-worker 在 worker 线程计算 hash，这样用户仍可以在主界面正常的交互。

由于实例化 web-worker 时，参数是一个 JavaScript 文件路径，且不能跨域。所以我们单独创建一个 hash.js 文件放在 public 目录下，另外在 worker 中也是不允许访问 DOM 的，但它提供了importScripts 函数用于导入外部脚本，通过它导入 spark-md5。

// /public/hash.js
self.importScripts("/spark-md5.min.js"); // 导入脚本

// 生成文件 hash
self.onmessage=e=> {
  const { fileChunkList }=e.data;
  const spark=new self.SparkMD5.ArrayBuffer();
  let percentage=0;
  let count=0;
  const loadNext=index=> {
    const reader=new FileReader();
    reader.readAsArrayBuffer(fileChunkList[index].file);
    reader.onload=e=> {
      count++;
      spark.append(e.target.result);
      if (count===fileChunkList.length) {
        self.postMessage({
          percentage: 100,
          hash: spark.end()
        });
        self.close();
      } else {
        percentage +=100 / fileChunkList.length;
        self.postMessage({
          percentage
        });
        // 递归计算下一个切片
        loadNext(count);
      }
    };
  };
  loadNext(0);
};

在 worker 线程中，接受文件切片 fileChunkList，利用 FileReader 读取每个切片的 ArrayBuffer 并不断传入 spark-md5 中，每计算完一个切片通过 postMessage 向主线程发送一个进度事件，全部完成后将最终的 hash 发送给主线程。

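spark-md5 is fed one chunk at a time here, which avoids reading a very large file into memory in one go, and the hash is only final once every chunk has been appended. The author also cautions that passing the whole file in directly can produce the same hash for different files; see the official documentation for details.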

spark-md5[1]
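The incremental pattern used in the worker (append each chunk, then finish) can be tried in Node.js with the built-in crypto module. Note that this sketch swaps spark-md5 for Node's createHash purely for illustration; hashChunks is a hypothetical helper.

```javascript
const { createHash } = require('crypto');

// Feeds the chunks to the hash one at a time and returns the hex digest.
function hashChunks(chunks) {
  const hash = createHash('md5');
  for (const chunk of chunks) hash.update(chunk);
  return hash.digest('hex');
}

const wholeDigest = hashChunks([Buffer.from('hello world')]);
const chunkedDigest = hashChunks([
  Buffer.from('hello'),
  Buffer.from(' '),
  Buffer.from('world'),
]);

// Appending the chunks in order gives the same digest as hashing the whole input.
console.log(wholeDigest === chunkedDigest);
```

The order of the chunks matters: appending them in a different order produces a different digest.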

接着编写主线程与 worker 线程通讯的逻辑

+	   // 生成文件 hash（web-worker）
+    calculateHash(fileChunkList) {
+      return new Promise(resolve=> {
+       // 添加 worker 属性
+        this.container.worker=new Worker("/hash.js");
+        this.container.worker.postMessage({ fileChunkList });
+        this.container.worker.onmessage=e=> {
+          const { percentage, hash }=e.data;
+          this.hashPercentage=percentage;
+          if (hash) {
+            resolve(hash);
+          }
+        };
+      });
    },
    async handleUpload() {
      if (!this.container.file) return;
      const fileChunkList=this.createFileChunk(this.container.file);
+     this.container.hash=await this.calculateHash(fileChunkList);
      this.data=fileChunkList.map(({ file }, index)=> ({
+       fileHash: this.container.hash,
        chunk: file,
        hash: this.container.file.name + "-" + index, // 文件名 + 数组下标
        percentage:0
      }));
      await this.uploadChunks();
    }

主线程使用 postMessage 给 worker 线程传入所有切片 fileChunkList，并监听 worker 线程发出的 postMessage 事件拿到文件 hash。

加上显示计算 hash 的进度条，看起来像这样

At this point the front end should replace every place that used the file name as the hash with the hash returned by the worker.

服务端则使用 hash 作为切片文件夹名，hash + 下标作为切片名，hash + 扩展名作为文件名，没有新增的逻辑。

文件秒传

在实现断点续传前先简单介绍一下文件秒传。

所谓的文件秒传，即在服务端已经存在了上传的资源，所以当用户再次上传时会直接提示上传成功

文件秒传需要依赖上一步生成的 hash，即在上传前，先计算出文件 hash，并把 hash 发送给服务端进行验证，由于 hash 的唯一性，所以一旦服务端能找到 hash 相同的文件，则直接返回上传成功的信息即可。

+    async verifyUpload(filename, fileHash) {
+       const { data }=await this.request({
+         url: "http://localhost:3000/verify",
+         headers: {
+           "content-type": "application/json"
+         },
+         data: JSON.stringify({
+           filename,
+           fileHash
+         })
+       });
+       return JSON.parse(data);
+     },
   async handleUpload() {
      if (!this.container.file) return;
      const fileChunkList=this.createFileChunk(this.container.file);
      this.container.hash=await this.calculateHash(fileChunkList);
+     const { shouldUpload }=await this.verifyUpload(
+       this.container.file.name,
+       this.container.hash
+     );
+     if (!shouldUpload) {
+       this.$message.success("秒传：上传成功");
+       return;
+    }
     this.data=fileChunkList.map(({ file }, index)=> ({
        fileHash: this.container.hash,
        index,
        hash: this.container.hash + "-" + index,
        chunk: file,
        percentage: 0
      }));
      await this.uploadChunks();
    }

秒传其实就是给用户看的障眼法，实质上根本没有上传。就像下面这行代码 :)

服务端的逻辑非常简单，新增一个验证接口，验证文件是否存在即可。

+ const extractExt=filename=>
+  filename.slice(filename.lastIndexOf("."), filename.length); // 提取后缀名
const UPLOAD_DIR=path.resolve(__dirname, "..", "target"); // 大文件存储目录

const resolvePost=req=>
  new Promise(resolve=> {
    let chunk="";
    req.on("data", data=> {
      chunk +=data;
    });
    req.on("end", ()=> {
      resolve(JSON.parse(chunk));
    });
  });

server.on("request", async (req, res)=> {
  if (req.url==="/verify") {
+    const data=await resolvePost(req);
+    const { fileHash, filename }=data;
+    const ext=extractExt(filename);
+    const filePath=`${UPLOAD_DIR}/${fileHash}${ext}`;
+    if (fse.existsSync(filePath)) {
+      res.end(
+        JSON.stringify({
+          shouldUpload: false
+        })
+      );
+    } else {
+      res.end(
+        JSON.stringify({
+          shouldUpload: true
+        })
+      );
+    }
  }
});
server.listen(3000, ()=> console.log("正在监听 3000 端口"));

暂停上传

讲完了生成 hash 和文件秒传，回到断点续传。

断点续传顾名思义即断点 + 续传，所以我们第一步先实现"断点"，也就是暂停上传。

原理是使用 XMLHttpRequest 的 abort 方法，可以取消一个 xhr 请求的发送，为此我们需要将上传每个切片的 xhr 对象保存起来，我们再改造一下 request 方法。

   request({
      url,
      method="post",
      data,
      headers={},
      onProgress=e=> e,
+     requestList
    }) {
      return new Promise(resolve=> {
        const xhr=new XMLHttpRequest();
        xhr.upload.onprogress=onProgress;
        xhr.open(method, url);
        Object.keys(headers).forEach(key=>
          xhr.setRequestHeader(key, headers[key])
        );
        xhr.send(data);
        xhr.onload=e=> {
+          // 将请求成功的 xhr 从列表中删除
+          if (requestList) {
+            const xhrIndex=requestList.findIndex(item=> item===xhr);
+            requestList.splice(xhrIndex, 1);
+          }
          resolve({
            data: e.target.response
          });
        };
+        // 暴露当前 xhr 给外部
+        requestList?.push(xhr);
      });
    },

这样在上传切片时传入 requestList 数组作为参数，request 方法就会将所有的 xhr 保存在数组中了。

每当一个切片上传成功时，将对应的 xhr 从 requestList 中删除，所以 requestList 中只保存正在上传切片的 xhr。

之后新建一个暂停按钮，当点击按钮时，调用保存在 requestList 中 xhr 的 abort 方法，即取消并清空所有正在上传的切片。

handlePause() {
    this.requestList.forEach(xhr=> xhr?.abort());
    this.requestList=[];
}

点击暂停按钮可以看到 xhr 都被取消了。

恢复上传

As mentioned when introducing resumable upload, resuming uses the second approach, in which the server stores the upload state.

由于当文件切片上传后，服务端会建立一个文件夹存储所有上传的切片，所以每次前端上传前可以调用一个接口，服务端将已上传的切片的切片名返回，前端再跳过这些已经上传切片，这样就实现了"续传"的效果

而这个接口可以和之前秒传的验证接口合并，前端每次上传前发送一个验证的请求，返回两种结果：

服务端已存在该文件，不需要再次上传。
服务端不存在该文件或者已上传部分文件切片，通知前端进行上传，并把已上传的文件切片返回给前端。

所以我们改造一下之前文件秒传的服务端验证接口：

const extractExt=filename=>
  filename.slice(filename.lastIndexOf("."), filename.length); // 提取后缀名
const UPLOAD_DIR=path.resolve(__dirname, "..", "target"); // 大文件存储目录

const resolvePost=req=>
  new Promise(resolve=> {
    let chunk="";
    req.on("data", data=> {
      chunk +=data;
    });
    req.on("end", ()=> {
      resolve(JSON.parse(chunk));
    });
  });

+  // 返回已经上传切片名列表
+ const createUploadedList=async fileHash=>
+   fse.existsSync(`${UPLOAD_DIR}/${fileHash}`)
+    ? await fse.readdir(`${UPLOAD_DIR}/${fileHash}`)
+    : [];

server.on("request", async (req, res)=> {
  if (req.url==="/verify") {
    const data=await resolvePost(req);
    const { fileHash, filename }=data;
    const ext=extractExt(filename);
    const filePath=`${UPLOAD_DIR}/${fileHash}${ext}`;
    if (fse.existsSync(filePath)) {
      res.end(
        JSON.stringify({
          shouldUpload: false
        })
      );
    } else {
      res.end(
        JSON.stringify({
          shouldUpload: true,
+         uploadedList: await createUploadedList(fileHash)
        })
      );
    }
  }
});
server.listen(3000, ()=> console.log("正在监听 3000 端口"));

接着回到前端，前端有两个地方需要调用验证的接口:

点击上传时，检查是否需要上传和已上传的切片。
点击暂停后的恢复上传，返回已上传的切片。

新增恢复按钮并改造原来上传切片的逻辑：

<template>
  <div id="app">
      <input
        type="file"
        @change="handleFileChange"
      />
       <el-button @click="handleUpload">上传</el-button>
       <el-button @click="handlePause" v-if="!isPaused">暂停</el-button>
+      <el-button @click="handleResume" v-else>恢复</el-button>
      //...
    </div>
</template>

+   async handleResume() {
+      const { uploadedList }=await this.verifyUpload(
+        this.container.file.name,
+        this.container.hash
+      );
+      await this.uploadChunks(uploadedList);
    },
    async handleUpload() {
      if (!this.container.file) return;
      const fileChunkList=this.createFileChunk(this.container.file);
      this.container.hash=await this.calculateHash(fileChunkList);

+     const { shouldUpload, uploadedList }=await this.verifyUpload(
        this.container.file.name,
        this.container.hash
      );
      if (!shouldUpload) {
        this.$message.success("秒传：上传成功");
        return;
      }

      this.data=fileChunkList.map(({ file }, index)=> ({
        fileHash: this.container.hash,
        index,
        hash: this.container.hash + "-" + index,
        chunk: file,
        percentage: 0
      }));

+      await this.uploadChunks(uploadedList);
    },
   // 上传切片，同时过滤已上传的切片
+   async uploadChunks(uploadedList=[]) {
      const requestList=this.data
+        .filter(({ hash })=> !uploadedList.includes(hash))
        .map(({ chunk, hash, index })=> {
          const formData=new FormData();
          formData.append("chunk", chunk);
          formData.append("hash", hash);
          formData.append("filename", this.container.file.name);
          formData.append("fileHash", this.container.hash);
          return { formData, index };
        })
        .map(async ({ formData, index })=>
          this.request({
            url: "http://localhost:3000",
            data: formData,
            onProgress: this.createProgressHandler(this.data[index]),
            requestList: this.requestList
          })
        );
      await Promise.all(requestList);
      // 之前上传的切片数量 + 本次上传的切片数量=所有切片数量时
      // 合并切片
+      if (uploadedList.length + requestList.length===this.data.length) {
         await this.mergeRequest();
+      }
    }

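Here the original chunk upload function gains an uploadedList parameter, i.e. the list of chunk names returned by the server in the figure above. filter drops the chunks that are already uploaded, and because some chunks may now have been uploaded earlier, the condition that triggers the merge request changes as well: it fires when the number of previously uploaded chunks plus the number uploaded this time equals the total number of chunks.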
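To make the filtering step and the new merge condition easier to check in isolation, here is a minimal sketch outside of Vue. The helper names selectPendingChunks and shouldMerge and the sample hash "abc" are assumptions made for illustration; only the chunk shape (hash, index) and the `<fileHash>-<index>` naming come from the code above.

```javascript
// Hypothetical helpers; chunk objects mirror the shape of this.data above.
function selectPendingChunks(chunks, uploadedList = []) {
  // Keep only the chunks whose name is not already stored on the server.
  return chunks.filter(({ hash }) => !uploadedList.includes(hash));
}

function shouldMerge(uploadedCount, sentCount, totalCount) {
  // Merge only when earlier uploads plus this round cover every chunk.
  return uploadedCount + sentCount === totalCount;
}

// Sample data: three chunks, two of which the server already has.
const chunks = ["abc-0", "abc-1", "abc-2"].map((hash, index) => ({ hash, index }));
const uploadedList = ["abc-0", "abc-2"];

const pending = selectPendingChunks(chunks, uploadedList);
console.log(pending.map((c) => c.hash)); // [ 'abc-1' ]
console.log(shouldMerge(uploadedList.length, pending.length, chunks.length)); // true
```

Keeping the two decisions as pure functions is only one possible design; it makes the resume logic testable without a browser or a server.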

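At this point the resumable upload feature is basically complete.

Progress bar improvements

Resumable upload now works, but the progress bar display rules still need adjusting; otherwise the progress bars will be off when the upload is paused or when the list of already uploaded chunks comes back.

Chunk progress bars

Clicking upload or resume calls the verification endpoint, which returns the chunks already uploaded, so the progress of those chunks needs to be set to 100%.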

    async handleUpload() {
      if (!this.container.file) return;
      const fileChunkList = this.createFileChunk(this.container.file);
      this.container.hash = await this.calculateHash(fileChunkList);
      const { shouldUpload, uploadedList } = await this.verifyUpload(
        this.container.file.name,
        this.container.hash
      );
      if (!shouldUpload) {
        this.$message.success("Instant upload: upload succeeded");
        return;
      }
      this.data = fileChunkList.map(({ file }, index) => ({
        fileHash: this.container.hash,
        index,
        hash: this.container.hash + "-" + index,
        chunk: file,
        size: file.size, // chunk size, read by the uploadPercentage computed property
        // uploadedList holds chunk names ("<fileHash>-<index>"), so match by name, not by index
+       percentage: uploadedList.includes(this.container.hash + "-" + index) ? 100 : 0
      }));
      await this.uploadChunks(uploadedList);
    },

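uploadedList contains the names of the chunks already uploaded, so while iterating over all chunks it is enough to check whether the current chunk is in that list.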
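The initialization can also be sketched as a standalone function. This assumes chunk names follow the `<fileHash>-<index>` pattern used for hash above; the function name buildChunkState, the chunkSizes parameter and the sample values are hypothetical.

```javascript
// Hypothetical helper: build the per-chunk progress state before uploading.
// chunkSizes stands in for the sizes of the Blob slices.
function buildChunkState(fileHash, chunkSizes, uploadedList = []) {
  return chunkSizes.map((size, index) => {
    const hash = `${fileHash}-${index}`;
    return {
      index,
      hash,
      size,
      // Chunks the server already has start at 100%, the rest at 0%.
      percentage: uploadedList.includes(hash) ? 100 : 0
    };
  });
}

const state = buildChunkState("abc", [10, 10, 5], ["abc-1"]);
console.log(state.map((c) => c.percentage)); // [ 0, 100, 0 ]
```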

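File progress bar

As mentioned earlier, the file progress bar is a computed property derived from the upload progress of all chunks, which leads to a problem:

Clicking pause aborts and clears the chunks' xhr requests. If part of the data has already been uploaded at that point, the file progress bar appears to move backwards:

When resume is clicked, new xhr requests are created, so the progress of those chunks resets to zero and the overall progress bar drops back.

The fix is to create a "fake" progress bar that is based on the file progress bar but only ever stops or increases, and to show this fake bar to the user.

Here we use a Vue watcher: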

  data: () => ({
+   fakeUploadPercentage: 0
  }),
  computed: {
    uploadPercentage() {
      if (!this.container.file || !this.data.length) return 0;
      const loaded = this.data
        .map(item => item.size * item.percentage)
        .reduce((acc, cur) => acc + cur);
      return parseInt((loaded / this.container.file.size).toFixed(2));
    }
  },
  watch: {
+   uploadPercentage(now) {
+     if (now > this.fakeUploadPercentage) {
+       this.fakeUploadPercentage = now;
+     }
+   }
  },

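When uploadPercentage, the real file progress, increases, fakeUploadPercentage increases with it; when the real progress moves backwards, the fake progress bar simply stays where it is.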
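The same "never go backwards" rule can be tried outside of Vue with a small closure. This is only a sketch of the idea behind the watcher; createMonotonicProgress and the sample progress values are made up for illustration.

```javascript
// Hypothetical helper: returns an updater that remembers the highest value seen.
function createMonotonicProgress() {
  let shown = 0; // plays the role of fakeUploadPercentage
  return function update(real) {
    // Follow the real progress upwards, hold still when it drops.
    if (real > shown) shown = real;
    return shown;
  };
}

const update = createMonotonicProgress();
// Real progress drops from 60 to 20 after a pause/resume, then recovers.
const displayed = [30, 60, 20, 45, 80].map((real) => update(real));
console.log(displayed); // [ 30, 60, 60, 60, 80 ]
```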

That completes a solution for large file upload plus resumable upload.

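Summary

Large file upload:

The front end slices the file with Blob.prototype.slice, uploads multiple chunks concurrently, and finally sends a merge request telling the server to merge the chunks.
The server receives and stores the chunks, and on receiving the merge request merges them with fs.appendFileSync.
The upload.onprogress event of the native XMLHttpRequest tracks each chunk's upload progress.
A Vue computed property derives the whole file's upload progress from the progress of each chunk.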
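As a rough illustration of the last point, the overall percentage is a size-weighted average of the chunk percentages. The sketch below is not the article's computed property: totalPercentage is a hypothetical name, and Math.floor is swapped in for the parseInt/toFixed combination used earlier.

```javascript
// Hypothetical helper: chunks carry { size, percentage } with percentage in 0..100.
function totalPercentage(chunks, fileSize) {
  if (!fileSize || chunks.length === 0) return 0;
  // Sum of size * percentage, so larger chunks weigh more.
  const loaded = chunks.reduce(
    (acc, { size, percentage }) => acc + size * percentage,
    0
  );
  return Math.floor(loaded / fileSize);
}

const sample = [
  { size: 10, percentage: 100 },
  { size: 10, percentage: 50 },
  { size: 5, percentage: 0 }
];
console.log(totalPercentage(sample, 25)); // 60
```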

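Resumable upload:

spark-md5 computes the file hash from the file content.
The hash lets the server tell whether the file has already been uploaded, in which case the user is told right away that the upload succeeded (instant upload).
The abort method of XMLHttpRequest pauses the chunk uploads.
Before uploading, the server returns the names of the chunks already uploaded, and the front end skips them.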

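Source code

The source code adds some button states for friendlier interaction; where the article is hard to follow, refer to the source code.

file-upload[2]

References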

写给新手前端的各种文件上传攻略，从小图片到大文件断点续传[3]
Blob.slice[4]
