A Year of NLP Breakthroughs: BERT Variants Everywhere, Records Broken Again and Again

By 三, from Aofeisi
Reported by QbitAI (量子位) | WeChat official account: QbitAI

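What breakthroughs did natural language processing (NLP) achieve in 2019?

When it comes to NLP, BERT is practically a household name.

It has achieved excellent results on many NLP tasks, including sentiment analysis, question answering, and sentence similarity.

It also shows up constantly, whether in competitions such as Kaggle or in media coverage.

BERT was published in late 2018, and in the year since then the fields of NLP and NLU (natural language understanding) have advanced considerably.

Taking the release of BERT as the dividing point, this article reviews the important NLP projects and models that came before and after it.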

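Timeline of major NLP projects before BERT

Before BERT was proposed, the major projects in NLP were, in chronological order:

Word2Vec was released in January 2013 and remains very popular today.

It is probably the first model a researcher will try on any NLP task.

https://arxiv.org/abs/1301.3781

FastText and GloVe were proposed in July 2016 and January 2014, respectively.

FastText is a free, open-source, lightweight library that lets users learn text representations and text classifiers.

https://fasttext.cc/

GloVe is an unsupervised learning algorithm for obtaining vector representations of words.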

https://nlp.stanford.edu/projects/glove/
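
To make the idea of a word vector concrete, here is a minimal Java sketch that compares words by the cosine similarity of their vectors. The three-dimensional vectors are invented for illustration; real Word2Vec, FastText, or GloVe vectors are learned from a corpus and usually have hundreds of dimensions. The class and method names are hypothetical.

```java
final class WordVectorDemo {

    /** Cosine similarity between two vectors of the same dimension. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors, made up for this example (not real embeddings).
        double[] cat = {0.9, 0.1, 0.3};
        double[] dog = {0.8, 0.2, 0.4};
        double[] car = {0.1, 0.9, 0.2};

        System.out.println("cat vs dog: " + cosine(cat, dog));
        System.out.println("cat vs car: " + cosine(cat, car));
    }
}
```

With trained embeddings, words that occur in similar contexts tend to end up with a higher cosine similarity than unrelated words.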

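The Transformer was proposed in June 2017 and is a model based on an encoder-decoder structure.

On machine translation it outperformed RNN- and CNN-based models, achieving strong results using only an encoder-decoder structure and the attention mechanism. Its biggest advantage is that it can be parallelized efficiently.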

https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
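
The attention mechanism mentioned above can be sketched in a few lines of Java. This is a simplified, single-query version of scaled dot-product attention (softmax of query-key dot products divided by the square root of the dimension, used to weight the values); it leaves out the learned projections, multiple heads, and batching of a real Transformer. All names are hypothetical.

```java
final class AttentionSketch {

    /**
     * Scaled dot-product attention for one query vector.
     * keys[i] and values[i] belong to the same input position.
     */
    static double[] attend(double[] query, double[][] keys, double[][] values) {
        int n = keys.length;
        double scale = Math.sqrt(query.length);

        // Raw attention scores: (q . k_i) / sqrt(d)
        double[] scores = new double[n];
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < n; i++) {
            double dot = 0.0;
            for (int j = 0; j < query.length; j++) {
                dot += query[j] * keys[i][j];
            }
            scores[i] = dot / scale;
            max = Math.max(max, scores[i]);
        }

        // Softmax over the scores (subtracting the max for numerical stability).
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            scores[i] = Math.exp(scores[i] - max);
            sum += scores[i];
        }

        // Output is the attention-weighted sum of the value vectors.
        double[] out = new double[values[0].length];
        for (int i = 0; i < n; i++) {
            double weight = scores[i] / sum;
            for (int j = 0; j < out.length; j++) {
                out[j] += weight * values[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0};
        double[][] keys = {{1.0, 0.0}, {0.0, 1.0}};
        double[][] values = {{10.0, 0.0}, {0.0, 10.0}};
        double[] out = attend(query, keys, values);
        System.out.println(out[0] + ", " + out[1]);
    }
}
```

Each output position depends only on dot products with all keys, with no step-by-step recurrence, which is why the computation parallelizes well.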

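ELMo was proposed in February 2018. It pre-trains a bidirectional language model, derives a context-dependent representation of each word from that model for the given input, and then feeds those representations as features into a task-specific supervised NLP model.

https://allennlp.org/elmo

Another one is ULMFiT, a transfer learning method for NLP tasks. With only a very small amount of labeled data, it can match the text classification accuracy of models trained on far more labeled data (the paper reports roughly 100 times more).

https://arxiv.org/abs/1801.06146

Note that ELMo and ULMFiT appeared before BERT and do not use a Transformer-based architecture.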

BERT

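BERT was proposed in October 2018.

Its full name is Bidirectional Encoder Representations from Transformers: it is the encoder of a bidirectional Transformer (the decoder is not used, because a decoder cannot see the information it is asked to predict).

Paper: https://arxiv.org/abs/1810.04805

The model's main innovations lie in its pre-training method, which uses two objectives, Masked LM and Next Sentence Prediction, to capture word-level and sentence-level representations respectively.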
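
As a rough illustration of the Masked LM objective, the Java sketch below picks about 15% of the tokens as prediction targets and, following the recipe in the BERT paper, replaces 80% of them with [MASK], 10% with a random token, and leaves 10% unchanged. It is a simplification: tokens are plain strings rather than WordPiece IDs, each position is sampled independently, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class MaskedLmSketch {
    static final String MASK = "[MASK]";

    /**
     * Returns a corrupted copy of tokens. The indices chosen as prediction
     * targets are appended to targetPositions.
     */
    static List<String> mask(List<String> tokens, List<String> vocab,
                             Random rng, List<Integer> targetPositions) {
        List<String> corrupted = new ArrayList<>(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            if (rng.nextDouble() >= 0.15) {
                continue; // roughly 85% of positions are left alone
            }
            targetPositions.add(i);
            double r = rng.nextDouble();
            if (r < 0.8) {
                corrupted.set(i, MASK);                                  // 80%: mask token
            } else if (r < 0.9) {
                corrupted.set(i, vocab.get(rng.nextInt(vocab.size()))); // 10%: random token
            }
            // remaining 10%: keep the original token
        }
        return corrupted;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "cat", "sat", "on", "the", "mat");
        List<Integer> targets = new ArrayList<>();
        List<String> corrupted = mask(tokens, tokens, new Random(), targets);
        System.out.println(corrupted + " targets=" + targets);
    }
}
```

During pre-training the model sees the corrupted sequence and is trained to predict the original token at each target position.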

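Google has even started using BERT to improve its search results.

Here is a fairly detailed BERT tutorial:
http://jalammar.github.io/illustrated-bert/

The pre-trained weights can be downloaded from the official GitHub repo:
https://github.com/google-research/bert

BERT is also available as a TensorFlow Hub module:
https://tfhub.dev/google/collections/bert/1

A number of very practical libraries are listed at the end of this article.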

Timeline of major NLP projects after BERT

After Google proposed BERT, a series of other notable projects followed in the NLP field.

Transformer-XL

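Transformer-XL is an upgraded version of the Transformer. According to its paper, it is up to 1,800+ times faster than a vanilla Transformer during evaluation.

XL stands for extra long: Transformer-XL performs very well on long-range dependencies in language modeling, and the name signals that it was designed for exactly that problem.

Long-range dependency is a hard problem for current text-processing models, and it is where RNNs fall short.

By comparison, Transformer-XL learns dependencies that are 80% longer than those of RNNs and 450% longer than those of vanilla Transformers.

It performs well on both short and long sequences.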

https://arxiv.org/abs/1901.02860

GPT-2

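GPT-2 is arguably the NLP model that has received the most media attention since BERT.

It is a remarkably capable language model released by OpenAI, with 1.5 billion parameters in total.

Without task-specific training it performs strongly on a range of domain-specific language modeling tasks, and it also shows abilities in reading comprehension, question answering, summarization, and translation.

OpenAI was initially concerned that the model was too powerful and chose not to release it in full, but about nine months later, in November 2019, it decided to publish it after all.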

https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

ERNIE

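ERNIE is built on PaddlePaddle (飞桨), Baidu's own deep learning framework, and can make use of lexical, syntactic, and knowledge information at the same time.

Experimental results show significant improvements on various knowledge-driven tasks, while remaining comparable to existing BERT models on other common tasks.

At the time of writing, ERNIE 2.0 ranks first on the GLUE leaderboard.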
https://github.com/PaddlePaddle/ERNIE

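XLNet

XLNet is a BERT-like model and a generalized autoregressive (AR) pre-training method.

First, instead of using the fixed forward or backward factorization order of conventional AR models, it maximizes the expected log-likelihood over all possible permutations of the factorization order.

Second, as a generalized AR language model, XLNet does not rely on corrupted (masked) input data.

In addition, XLNet improves the architectural design for pre-training.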

https://arxiv.org/abs/1906.08237

RoBERTa

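RoBERTa was proposed by Facebook.

It does not change Google's BERT at the model level; it only changes the pre-training procedure.

Compared with BERT, its main changes in model scale, compute, and data are:

More compute: the model was trained on 1,024 V100 GPUs for about one day.

Larger batch size: RoBERTa uses a larger batch size during training; batch sizes ranging from 256 to 8,000 were tried.

More training data: 160 GB of plain text, including CC-NEWS.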

https://arxiv.org/abs/1907.11692

Salesforce CTRL

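CTRL stands for Conditional Transformer Language model, and it has 1.6 billion parameters.

It offers powerful, controllable text generation, and it can predict which subset of the training data most influenced a given generated sequence.

By identifying the most influential sources of training data in the model, it offers a potential way to analyze large volumes of generated text.

CTRL can also improve other NLP applications, either by fine-tuning on a specific task or by transferring the representations the model has learned.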

https://blog.einstein.ai/introducing-a-conditional-transformer-language-model-for-controllable-generation/

ALBERT

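ALBERT is a lightweight BERT model released by Google.

It uses far fewer parameters than BERT (the paper cites about 18 times fewer for ALBERT-large compared with BERT-large), and ALBERT set new state-of-the-art results on SQuAD and RACE.

Not long ago, Google upgraded it, releasing ALBERT v2 and a Chinese version.

In this version, the "no dropout", "additional training data", and "long training time" strategies are applied to all models.

In terms of performance, v2 is much better than v1 for ALBERT-base, ALBERT-large, and ALBERT-xlarge.

This shows how important these three strategies are.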

https://arxiv.org/abs/1909.11942

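Performance benchmarks

One way to evaluate these language models is the GLUE benchmark.

It comprises a variety of NLP tasks for evaluating models, such as classification and question answering.

When BERT was first released, it topped the GLUE leaderboard.

But as of January 2, 2020, barely a year later, BERT had dropped to 19th place.

There is now also a SuperGLUE benchmark, which contains harder language understanding tasks.

For evaluating question answering systems, SQuAD is the commonly used benchmark.

BERT and other Transformer-based models perform well on it.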

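Other BERT-related projects

DistilBERT

DistilBERT is a small NLP Transformer model released by HuggingFace. Its architecture is similar to BERT's, but it uses only 66 million parameters while achieving 95% of BERT's performance on the GLUE benchmark.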

https://arxiv.org/abs/1910.01108

Megatron-LM

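Megatron-LM is an NLP model released by NVIDIA.

Combining its own hardware with its parallel computing software, NVIDIA set three records at the time:

BERT training completed in just 53 minutes;
BERT inference latency of just 2.2 ms;
a model with 8.3 billion parameters.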

https://github.com/NVIDIA/Megatron-LM

BioBERT

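BioBERT is a pre-trained biomedical language representation model for biomedical text mining.

Pre-trained on biomedical corpora, it largely outperforms BERT and previous state-of-the-art models on a variety of biomedical text mining tasks.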

https://github.com/dmis-lab/biobert

CamemBERT

CamemBERT is a French language model based on the RoBERTa architecture.

https://camembert-model.fr/

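NLP libraries

Below are some NLP libraries the author considers worth knowing.

spaCy

spaCy is a popular, fast NLP library that handles a variety of natural language processing tasks such as tokenization and part-of-speech tagging. It also provides pre-trained models, for example for NER.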

https://spacy.io/

HuggingFace Transformers

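It was one of the first libraries to provide a PyTorch implementation of BERT, and was originally called pytorch-pretrained-bert.

Later, more models were added, such as GPT-2 and XLNet.

In less than a year it has become one of the most popular NLP libraries, and it makes BERT and other models much easier to use.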

https://github.com/huggingface/transformers

AllenNLP

AllenNLP is an NLP library from the Allen Institute for AI, built on PyTorch.

https://allennlp.org/

Flair

Flair is another NLP library that ships with models for NER, POS tagging, and more, and it also supports BERT, ELMo, XLNet, and other embeddings.

https://github.com/flairNLP/flair

GluonNLP

GluonNLP is an NLP toolkit built on Apache MXNet, and was one of the first libraries to include pre-trained BERT embeddings.

https://gluon-nlp.mxnet.io/

So what breakthroughs will NLP see in 2020?

Source

https://towardsdatascience.com/2019-year-of-bert-and-transformer-f200b53d05b9


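This article lists several ways to implement third-person character control.

1. The third-person controller from the official example.

This controller is fairly complex, but it is well written and complete, and it uses the new animation system. You can download the official character controller package and use it as is.

2. The third-person controller from the older official package.

As you may know, the old third-person controller was written in JavaScript, which can be awkward to use directly. Here it is rewritten as a C# script (based on 雨松's port), which makes it much more convenient. It uses the legacy animation system, which many people still need.

Unity's newer Mecanim animation system is very practical and meets many needs, but that does not mean it replaces the legacy animation system, so at this point both animation systems remain in use.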

using UnityEngine;
using System.Collections;

[RequireComponent(typeof(CharacterController))]
public class ThirdPersonController111 : MonoBehaviour
{
    public AnimationClip idleAnimation;
    public AnimationClip walkAnimation;
    public AnimationClip runAnimation;
    public AnimationClip jumpPoseAnimation;

    public float walkMaxAnimationSpeed = 0.75f;
    public float trotMaxAnimationSpeed = 1.0f;
    public float runMaxAnimationSpeed = 1.0f;
    public float jumpAnimationSpeed = 1.15f;
    public float landAnimationSpeed = 1.0f;

    private Animation _animation;

    enum CharacterState
    {
        Idle = 0,
        Walking = 1,
        Trotting = 2,
        Running = 3,
        Jumping = 4,
    }

    private CharacterState _characterState;

    // The speed when walking
    float walkSpeed = 2.0f;
    // After trotAfterSeconds of walking we trot with trotSpeed
    float trotSpeed = 4.0f;
    // While a Shift key is held we run with runSpeed
    float runSpeed = 6.0f;

    float inAirControlAcceleration = 3.0f;

    // How high do we jump when pressing jump and letting go immediately
    float jumpHeight = 0.5f;

    // The gravity for the character
    float gravity = 20.0f;

    // How quickly moveSpeed is smoothed towards the target speed
    float speedSmoothing = 10.0f;
    float rotateSpeed = 500.0f;
    float trotAfterSeconds = 3.0f;

    bool canJump = true;

    private float jumpRepeatTime = 0.05f;
    private float jumpTimeout = 0.15f;
    private float groundedTimeout = 0.25f;

    // The camera doesn't start following the target immediately but waits for a split second to avoid too much waving around.
    private float lockCameraTimer = 0.0f;

    // The current move direction in x-z
    private Vector3 moveDirection = Vector3.zero;
    // The current vertical speed
    private float verticalSpeed = 0.0f;
    // The current x-z move speed
    private float moveSpeed = 0.0f;

    // The last collision flags returned from controller.Move
    private CollisionFlags collisionFlags;

    // Are we jumping? (Initiated with jump button and not grounded yet)
    private bool jumping = false;
    private bool jumpingReachedApex = false;

    // Are we moving backwards (This locks the camera to not do a 180 degree spin)
    private bool movingBack = false;
    // Is the user pressing any keys?
    private bool isMoving = false;
    // When did the user start walking (Used for going into trot after a while)
    private float walkTimeStart = 0.0f;
    // Last time the jump button was clicked down
    private float lastJumpButtonTime = -10.0f;
    // Last time we performed a jump
    private float lastJumpTime = -1.0f;

    // The height we jumped from (Used to determine for how long to apply extra jump power after jumping.)
    private float lastJumpStartHeight = 0.0f;

    private Vector3 inAirVelocity = Vector3.zero;

    private float lastGroundedTime = 0.0f;

    private bool isControllable = true;

    void Awake()
    {
        moveDirection = transform.TransformDirection(Vector3.forward);

        _animation = GetComponent<Animation>();
        if (!_animation)
            Debug.Log("The character you would like to control doesn't have animations. Moving her might look weird.");

        // If any required clip is missing, disable animation playback entirely.
        if (!idleAnimation)
        {
            _animation = null;
            Debug.Log("No idle animation found. Turning off animations.");
        }
        if (!walkAnimation)
        {
            _animation = null;
            Debug.Log("No walk animation found. Turning off animations.");
        }
        if (!runAnimation)
        {
            _animation = null;
            Debug.Log("No run animation found. Turning off animations.");
        }
        if (!jumpPoseAnimation && canJump)
        {
            _animation = null;
            Debug.Log("No jump animation found and the character has canJump enabled. Turning off animations.");
        }
    }

    void UpdateSmoothedMovementDirection()
    {
        Transform cameraTransform = Camera.main.transform;
        bool grounded = IsGrounded();

        // Forward vector relative to the camera along the x-z plane
        Vector3 forward = cameraTransform.TransformDirection(Vector3.forward);
        forward.y = 0;
        forward = forward.normalized;

        // Right vector relative to the camera
        // Always orthogonal to the forward vector
        Vector3 right = new Vector3(forward.z, 0, -forward.x);

        float v = Input.GetAxisRaw("Vertical");
        float h = Input.GetAxisRaw("Horizontal");

        // Are we moving backwards or looking backwards
        movingBack = v < -0.2f;

        bool wasMoving = isMoving;
        isMoving = Mathf.Abs(h) > 0.1f || Mathf.Abs(v) > 0.1f;

        // Target direction relative to the camera
        Vector3 targetDirection = h * right + v * forward;

        // Grounded controls
        if (grounded)
        {
            // Lock camera for short period when transitioning moving & standing still
            lockCameraTimer += Time.deltaTime;
            if (isMoving != wasMoving)
                lockCameraTimer = 0.0f;

            // We store speed and direction separately,
            // so that when the character stands still we still have a valid forward direction
            // moveDirection is always normalized, and we only update it if there is user input.
            if (targetDirection != Vector3.zero)
            {
                // If we are really slow, just snap to the target direction
                if (moveSpeed < walkSpeed * 0.9f && grounded)
                {
                    moveDirection = targetDirection.normalized;
                }
                // Otherwise smoothly turn towards it
                else
                {
                    moveDirection = Vector3.RotateTowards(moveDirection, targetDirection, rotateSpeed * Mathf.Deg2Rad * Time.deltaTime, 1000);
                    moveDirection = moveDirection.normalized;
                }
            }

            // Smooth the speed based on the current target direction
            float curSmooth = speedSmoothing * Time.deltaTime;

            // Choose target speed
            // We want to support analog input but make sure you can't walk faster diagonally than just forward or sideways
            float targetSpeed = Mathf.Min(targetDirection.magnitude, 1.0f);

            _characterState = CharacterState.Idle;

            // Pick speed modifier (logical OR: either Shift key triggers running)
            if (Input.GetKey(KeyCode.LeftShift) || Input.GetKey(KeyCode.RightShift))
            {
                targetSpeed *= runSpeed;
                _characterState = CharacterState.Running;
            }
            else if (Time.time - trotAfterSeconds > walkTimeStart)
            {
                targetSpeed *= trotSpeed;
                _characterState = CharacterState.Trotting;
            }
            else
            {
                targetSpeed *= walkSpeed;
                _characterState = CharacterState.Walking;
            }

            moveSpeed = Mathf.Lerp(moveSpeed, targetSpeed, curSmooth);

            // Reset walk time start when we slow down
            if (moveSpeed < walkSpeed * 0.3f)
                walkTimeStart = Time.time;
        }
        // In air controls
        else
        {
            // Lock camera while in air
            if (jumping)
                lockCameraTimer = 0.0f;

            if (isMoving)
                inAirVelocity += targetDirection.normalized * Time.deltaTime * inAirControlAcceleration;
        }
    }

    void ApplyJumping()
    {
        // Prevent jumping too fast after each other
        if (lastJumpTime + jumpRepeatTime > Time.time)
            return;

        if (IsGrounded())
        {
            // Jump
            // - Only when pressing the button down
            // - With a timeout so you can press the button slightly before landing
            if (canJump && Time.time < lastJumpButtonTime + jumpTimeout)
            {
                verticalSpeed = CalculateJumpVerticalSpeed(jumpHeight);
                SendMessage("DidJump", SendMessageOptions.DontRequireReceiver);
            }
        }
    }

    void ApplyGravity()
    {
        if (isControllable) // don't move player at all if not controllable.
        {
            // When we reach the apex of the jump we send out a message
            if (jumping && !jumpingReachedApex && verticalSpeed <= 0.0f)
            {
                jumpingReachedApex = true;
                SendMessage("DidJumpReachApex", SendMessageOptions.DontRequireReceiver);
            }

            // Apply gravity
            if (IsGrounded())
                verticalSpeed = 0.0f;
            else
                verticalSpeed -= gravity * Time.deltaTime;
        }
    }

    float CalculateJumpVerticalSpeed(float targetJumpHeight)
    {
        // From the jump height and gravity we deduce the upwards speed
        // for the character to reach at the apex.
        return Mathf.Sqrt(2 * targetJumpHeight * gravity);
    }

    // Called via SendMessage from ApplyJumping
    void DidJump()
    {
        jumping = true;
        jumpingReachedApex = false;
        lastJumpTime = Time.time;
        lastJumpStartHeight = transform.position.y;
        lastJumpButtonTime = -10;

        _characterState = CharacterState.Jumping;
    }

    void Update()
    {
        if (!isControllable)
        {
            // kill all inputs if not controllable.
            Input.ResetInputAxes();
        }

        if (Input.GetButtonDown("Jump"))
        {
            lastJumpButtonTime = Time.time;
        }

        UpdateSmoothedMovementDirection();

        // Apply gravity
        // - extra power jump modifies gravity
        // - controlledDescent mode modifies gravity
        ApplyGravity();

        // Apply jumping logic
        ApplyJumping();

        // Calculate actual motion
        Vector3 movement = moveDirection * moveSpeed + new Vector3(0, verticalSpeed, 0) + inAirVelocity;
        movement *= Time.deltaTime;

        // Move the controller
        CharacterController controller = GetComponent<CharacterController>();
        collisionFlags = controller.Move(movement);

        // ANIMATION sector
        if (_animation)
        {
            if (_characterState == CharacterState.Jumping)
            {
                if (!jumpingReachedApex)
                {
                    _animation[jumpPoseAnimation.name].speed = jumpAnimationSpeed;
                    _animation[jumpPoseAnimation.name].wrapMode = WrapMode.ClampForever;
                    _animation.CrossFade(jumpPoseAnimation.name);
                }
                else
                {
                    _animation[jumpPoseAnimation.name].speed = -landAnimationSpeed;
                    _animation[jumpPoseAnimation.name].wrapMode = WrapMode.ClampForever;
                    _animation.CrossFade(jumpPoseAnimation.name);
                }
            }
            else
            {
                if (controller.velocity.sqrMagnitude < 0.1f)
                {
                    _animation.CrossFade(idleAnimation.name);
                }
                else
                {
                    if (_characterState == CharacterState.Running)
                    {
                        _animation[runAnimation.name].speed = Mathf.Clamp(controller.velocity.magnitude, 0.0f, runMaxAnimationSpeed);
                        _animation.CrossFade(runAnimation.name);
                    }
                    else if (_characterState == CharacterState.Trotting)
                    {
                        _animation[walkAnimation.name].speed = Mathf.Clamp(controller.velocity.magnitude, 0.0f, trotMaxAnimationSpeed);
                        _animation.CrossFade(walkAnimation.name);
                    }
                    else if (_characterState == CharacterState.Walking)
                    {
                        _animation[walkAnimation.name].speed = Mathf.Clamp(controller.velocity.magnitude, 0.0f, walkMaxAnimationSpeed);
                        _animation.CrossFade(walkAnimation.name);
                    }
                }
            }
        }
        // ANIMATION sector

        // Set rotation to the move direction
        if (IsGrounded())
        {
            transform.rotation = Quaternion.LookRotation(moveDirection);
        }
        else
        {
            Vector3 xzMove = movement;
            xzMove.y = 0;
            if (xzMove.sqrMagnitude > 0.001f)
            {
                transform.rotation = Quaternion.LookRotation(xzMove);
            }
        }

        // We are in jump mode but just became grounded
        if (IsGrounded())
        {
            lastGroundedTime = Time.time;
            inAirVelocity = Vector3.zero;
            if (jumping)
            {
                jumping = false;
                SendMessage("DidLand", SendMessageOptions.DontRequireReceiver);
            }
        }
    }

    void OnControllerColliderHit(ControllerColliderHit hit)
    {
        // Debug.DrawRay(hit.point, hit.normal);
        if (hit.moveDirection.y > 0.01f)
            return;
    }

    float GetSpeed()
    {
        return moveSpeed;
    }

    public bool IsJumping()
    {
        return jumping;
    }

    bool IsGrounded()
    {
        return (collisionFlags & CollisionFlags.CollidedBelow) != 0;
    }

    Vector3 GetDirection()
    {
        return moveDirection;
    }

    public bool IsMovingBackwards()
    {
        return movingBack;
    }

    public float GetLockCameraTimer()
    {
        return lockCameraTimer;
    }

    bool IsMoving()
    {
        return Mathf.Abs(Input.GetAxisRaw("Vertical")) + Mathf.Abs(Input.GetAxisRaw("Horizontal")) > 0.5f;
    }

    bool HasJumpReachedApex()
    {
        return jumpingReachedApex;
    }

    bool IsGroundedWithTimeout()
    {
        return lastGroundedTime + groundedTimeout > Time.time;
    }

    void Reset()
    {
        gameObject.tag = "Player";
    }
}
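
The formula in CalculateJumpVerticalSpeed follows from constant-gravity kinematics. A short derivation, writing v_0 for the initial upward speed, g for gravity, and h for jumpHeight (the symbols are chosen here for illustration), and ignoring the small error introduced by the per-frame integration in ApplyGravity:

```latex
\begin{align*}
v(t) &= v_0 - g\,t, \qquad v(t_{\text{apex}}) = 0 \;\Rightarrow\; t_{\text{apex}} = \frac{v_0}{g} \\
h &= v_0\,t_{\text{apex}} - \tfrac{1}{2}\,g\,t_{\text{apex}}^{2} = \frac{v_0^{2}}{2g}
\;\Rightarrow\; v_0 = \sqrt{2\,g\,h} \\
h &= 0.5,\; g = 20 \;\Rightarrow\; v_0 = \sqrt{20} \approx 4.47
\end{align*}
```

With the default field values in the script (jumpHeight of 0.5 and gravity of 20), the character therefore leaves the ground at roughly 4.47 units per second.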

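3. Write your own controller to fit your needs.

In this training camp, the protagonist, Super Mario, uses a fairly simple control scheme, because it already meets the requirements: plain forward, backward, left, and right movement. It does not need a CharacterController; adding a capsule collider is enough (the script below also expects a Rigidbody on the same object). (Note: one drawback of this approach is that the facing direction is fixed, so the character can only face the positive Z axis.)

The code is as follows: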

using UnityEngine;
using System.Collections;

public class MarioMove : MonoBehaviour
{
    public float speed = 5.0f;
    public static bool isGround;
    public static bool IsAllowJump;

    [SerializeField]
    float m_StationaryTurnSpeed = 180;
    [SerializeField]
    float m_MovingTurnSpeed = 360;

    float m_ForwardAmount;
    float m_TurnAmount;
    // CheckGroundStatus is disabled in Move, so flat ground is assumed here.
    Vector3 m_GroundNormal = Vector3.up;

    private Vector3 m_Move;
    private Transform m_Cam;
    private Vector3 m_CamForward;

    // Use this for initialization
    void Start()
    {
        // get the transform of the main camera
        if (Camera.main != null)
        {
            m_Cam = Camera.main.transform;
        }
        else
        {
            Debug.LogWarning(
                "Warning: no main camera found. Third person character needs a Camera tagged \"MainCamera\", for camera-relative controls.");
            // we use self-relative controls in this case, which probably isn't what the user wants, but hey, we warned them!
        }
    }

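    void OnCollisionEnter(Collision collision)
    {
        // Note: a collider's tag is never null, so this check passes for every
        // collision and the else branch is never reached. To treat only the
        // ground as landable, use the commented-out tag comparison instead.
        //if (collision.collider.tag=="Ground")
        if (collision.collider.tag != null)
        {
            isGround = true;
            IsAllowJump = true;
        }
        else
        {
            isGround = false;
        }
    }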

    // Update is called once per frame
    void Update()
    {
        float h = Input.GetAxis("Horizontal");
        float v = Input.GetAxis("Vertical");

        GetComponent<Rigidbody>().MovePosition(transform.position - new Vector3(h, 0, v) * speed * Time.deltaTime);

        if (isGround && Input.GetButton("Jump"))
        {
            if (IsAllowJump)
            {
                transform.GetComponentInChildren<Animation>().CrossFade("jump");
                GetComponent<Rigidbody>().MovePosition(transform.position - new Vector3(-h * 0.1f, -0.15f, -v * 0.1f));
            }
        }
        else if (Input.GetButtonUp("Jump"))
        {
            IsAllowJump = false;
        }
        else
        {
            if (Input.GetAxis("Vertical") > 0.5f ||
                Input.GetAxis("Vertical") < -0.5f ||
                Input.GetAxis("Horizontal") > 0.5f ||
                Input.GetAxis("Horizontal") < -0.5f)
            {
                transform.GetComponentInChildren<Animation>().CrossFade("run");
            }
            else if ((Input.GetAxis("Vertical") > 0.0f && Input.GetAxis("Vertical") < 0.5f) ||
                     (Input.GetAxis("Vertical") > -0.5f && Input.GetAxis("Vertical") < 0.0f) ||
                     (Input.GetAxis("Horizontal") > 0.0f && Input.GetAxis("Horizontal") < 0.5f) ||
                     (Input.GetAxis("Horizontal") < 0.0f && Input.GetAxis("Horizontal") > -0.5f))
            {
                transform.GetComponentInChildren<Animation>().CrossFade("walk");
            }
            else
            {
                transform.GetComponentInChildren<Animation>().CrossFade("idle");
            }
        }

        if (m_Cam != null)
        {
            // calculate camera relative direction to move:
            m_CamForward = Vector3.Scale(m_Cam.forward, new Vector3(1, 0, 1)).normalized;
            m_Move = v * m_CamForward + h * m_Cam.right;
        }
        else
        {
            // we use world-relative directions in the case of no main camera
            m_Move = v * Vector3.forward + h * Vector3.right;
        }

        Move(m_Move);
    }

    public void Move(Vector3 move)
    {
        // convert the world relative moveInput vector into a local-relative
        // turn amount and forward amount required to head in the desired
        // direction.
        if (move.magnitude > 1f) move.Normalize();
        move = transform.InverseTransformDirection(move);
        //CheckGroundStatus();
        move = Vector3.ProjectOnPlane(move, m_GroundNormal);
        m_TurnAmount = Mathf.Atan2(move.x, move.z);
        m_ForwardAmount = move.z;

        ApplyExtraTurnRotation();
    }

    void ApplyExtraTurnRotation()
    {
        // help the character turn faster (this is in addition to root rotation in the animation)
        float turnSpeed = Mathf.Lerp(m_StationaryTurnSpeed, m_MovingTurnSpeed, m_ForwardAmount);
        transform.Rotate(0, m_TurnAmount * turnSpeed * Time.deltaTime, 0);
    }
}

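The jump code is written this way so that the jump height depends on how long the jump key is held, which feels a lot like the Super Mario games many of us played as children.

Original link: http://www.manew.com/thread-98040-1-1.html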

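This article takes a look at Compact Strings in Java 9.

Compressed Strings (Java 6)

Java 6 introduced Compressed Strings, which stored strings needing one byte per character in a byte[] and kept using char[] for strings needing two bytes per character. It could be enabled with -XX:+UseCompressedStrings, but it was deprecated in Java 7 and then removed in Java 8.

Compact Strings (Java 9)

Java 9 introduced Compact Strings to replace Java 6's Compressed Strings. The implementation is more thorough: it uses byte[] instead of char[] throughout, and adds a new field, coder, to indicate whether the encoding is LATIN1 or UTF16.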
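
The storage decision can be illustrated with a small Java sketch. This is not the JDK's code: the class and method names are hypothetical, and UTF-16 is written big-endian here for simplicity, whereas the JDK stores it in the platform's byte order. The rule it shows is that a string whose characters all fit in one byte (code points up to 0xFF) can be stored as LATIN1, and anything else needs two bytes per char.

```java
import java.nio.charset.StandardCharsets;

final class CompactSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Picks LATIN1 only if every char fits into a single byte. */
    static byte chooseCoder(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    /** Encodes the string with one or two bytes per char, depending on the coder. */
    static byte[] encode(String s) {
        return chooseCoder(s) == LATIN1
                ? s.getBytes(StandardCharsets.ISO_8859_1)
                : s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println("tom -> " + encode("tom").length + " bytes");
        System.out.println("\u4e2d\u6587 -> " + encode("\u4e2d\u6587").length + " bytes");
    }
}
```

For text that is mostly ASCII or Latin-1, this halves the size of the backing array compared with a char[].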

String

java.base/java/lang/String.java

public final class String
 implements java.io.Serializable, Comparable<String>, CharSequence,
 Constable, ConstantDesc {
 /**
 * The value is used for character storage.
 *
 * @implNote This field is trusted by the VM, and is a subject to
 * constant folding if String instance is constant. Overwriting this
 * field after construction will cause problems.
 *
 * Additionally, it is marked with {@link Stable} to trust the contents
 * of the array. No other facility in JDK provides this functionality (yet).
 * {@link Stable} is safe here, because value is never null.
 */
 @Stable
 private final byte[] value;
 /**
 * The identifier of the encoding used to encode the bytes in
 * {@code value}. The supported values in this implementation are
 *
 * LATIN1
 * UTF16
 *
 * @implNote This field is trusted by the VM, and is a subject to
 * constant folding if String instance is constant. Overwriting this
 * field after construction will cause problems.
 */
 private final byte coder;
 /** Cache the hash code for the string */
 private int hash; // Default to 0
 /** use serialVersionUID from JDK 1.0.2 for interoperability */
 private static final long serialVersionUID=-6849794470754667710L;
 /**
 * If String compaction is disabled, the bytes in {@code value} are
 * always encoded in UTF16.
 *
 * For methods with several possible implementation paths, when String
 * compaction is disabled, only one code path is taken.
 *
 * The instance field value is generally opaque to optimizing JIT
 * compilers. Therefore, in performance-sensitive place, an explicit
 * check of the static boolean {@code COMPACT_STRINGS} is done first
 * before checking the {@code coder} field since the static boolean
 * {@code COMPACT_STRINGS} would be constant folded away by an
 * optimizing JIT compiler. The idioms for these cases are as follows.
 *
 * For code such as:
 *
 * if (coder==LATIN1) { ... }
 *
 * can be written more optimally as
 *
 * if (coder()==LATIN1) { ... }
 *
 * or:
 *
 * if (COMPACT_STRINGS && coder==LATIN1) { ... }
 *
 * An optimizing JIT compiler can fold the above conditional as:
 *
 * COMPACT_STRINGS==true=> if (coder==LATIN1) { ... }
 * COMPACT_STRINGS==false=> if (false) { ... }
 *
 * @implNote
 * The actual value for this field is injected by JVM. The static
 * initialization block is used to set the value here to communicate
 * that this static final field is not statically foldable, and to
 * avoid any possible circular dependency during vm initialization.
 */
 static final boolean COMPACT_STRINGS;
 static {
 COMPACT_STRINGS=true;
 }
 /**
 * Class String is special cased within the Serialization Stream Protocol.
 *
 * A String instance is written into an ObjectOutputStream according to
 * <a href="{@docRoot}/../specs/serialization/protocol.html#stream-elements">
 * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
 */
 private static final ObjectStreamField[] serialPersistentFields=new ObjectStreamField[0];
 /**
 * Initializes a newly created {@code String} object so that it represents
 * an empty character sequence. Note that use of this constructor is
 * unnecessary since Strings are immutable.
 */
 public String() {
 this.value="".value;
 this.coder="".coder;
 }
 //......
 public char charAt(int index) {
 if (isLatin1()) {
 return StringLatin1.charAt(value, index);
 } else {
 return StringUTF16.charAt(value, index);
 }
 }
 public boolean equals(Object anObject) {
 if (this==anObject) {
 return true;
 }
 if (anObject instanceof String) {
 String aString=(String)anObject;
 if (coder()==aString.coder()) {
 return isLatin1() ? StringLatin1.equals(value, aString.value)
 : StringUTF16.equals(value, aString.value);
 }
 }
 return false;
 }
 public int compareTo(String anotherString) {
 byte v1[]=value;
 byte v2[]=anotherString.value;
 if (coder()==anotherString.coder()) {
 return isLatin1() ? StringLatin1.compareTo(v1, v2)
 : StringUTF16.compareTo(v1, v2);
 }
 return isLatin1() ? StringLatin1.compareToUTF16(v1, v2)
 : StringUTF16.compareToLatin1(v1, v2);
 }
 public int hashCode() {
 int h=hash;
 if (h==0 && value.length > 0) {
 hash=h=isLatin1() ? StringLatin1.hashCode(value)
 : StringUTF16.hashCode(value);
 }
 return h;
 }
 public int indexOf(int ch, int fromIndex) {
 return isLatin1() ? StringLatin1.indexOf(value, ch, fromIndex)
 : StringUTF16.indexOf(value, ch, fromIndex);
 }
 public String substring(int beginIndex) {
 if (beginIndex < 0) {
 throw new StringIndexOutOfBoundsException(beginIndex);
 }
 int subLen=length() - beginIndex;
 if (subLen < 0) {
 throw new StringIndexOutOfBoundsException(subLen);
 }
 if (beginIndex==0) {
 return this;
 }
 return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
 : StringUTF16.newString(value, beginIndex, subLen);
 }
 //......
 byte coder() {
 return COMPACT_STRINGS ? coder : UTF16;
 }
 byte[] value() {
 return value;
 }
 private boolean isLatin1() {
 return COMPACT_STRINGS && coder==LATIN1;
 }
 @Native static final byte LATIN1=0;
 @Native static final byte UTF16=1;
 //......
}
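  • COMPACT_STRINGS defaults to true, so the feature is enabled by default
  • The coder method returns the coder field if COMPACT_STRINGS is true, and UTF16 otherwise; the isLatin1 method returns true only if COMPACT_STRINGS is true and coder is LATIN1
  • Methods such as charAt, equals, hashCode, indexOf, and substring all rely on isLatin1 to dispatch to either StringLatin1 or StringUTF16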
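
The dispatch pattern in the bullets above can be mimicked outside the JDK. The following Java sketch is an illustration, not the JDK's StringLatin1 or StringUTF16 code: the class and method names are hypothetical, and the UTF16 branch assumes big-endian byte order. It shows how a single byte[] can be read either one byte or two bytes per char depending on the coder, and why shifting the array length by the coder gives the character count.

```java
final class CoderDispatchSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Reads the char at index, interpreting value according to coder. */
    static char charAt(byte[] value, byte coder, int index) {
        if (coder == LATIN1) {
            // One byte per char; mask to avoid sign extension.
            return (char) (value[index] & 0xFF);
        }
        // Two bytes per char (big-endian assumed in this sketch).
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    /** Number of chars: byte count for LATIN1, half of it for UTF16. */
    static int length(byte[] value, byte coder) {
        return value.length >> coder;
    }

    public static void main(String[] args) {
        byte[] latin = {'t', 'o', 'm'};
        byte[] utf16 = {0x4e, 0x2d}; // U+4E2D stored as two bytes
        System.out.println(charAt(latin, LATIN1, 1) + " length=" + length(latin, LATIN1));
        System.out.println((int) charAt(utf16, UTF16, 0) + " length=" + length(utf16, UTF16));
    }
}
```

Because LATIN1 is 0 and UTF16 is 1, the coder value doubles as the shift amount, which is one reason the two constants are defined that way.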

StringConcatFactory

Example

public class Java9StringDemo {
 public static void main(String[] args){
 String stringLiteral="tom";
 String stringObject=stringLiteral + "cat";
 }
}
  • In this code, stringObject is built by concatenating the variable stringLiteral with the literal cat

javap

javac src/main/java/com/example/javac/Java9StringDemo.java
javap -v src/main/java/com/example/javac/Java9StringDemo.class
 Last modified Apr 7, 2019; size 770 bytes
 MD5 checksum fecfca9c829402c358c4d5cb948004ff
 Compiled from "Java9StringDemo.java"
public class com.example.javac.Java9StringDemo
 minor version: 0
 major version: 56
 flags: (0x0021) ACC_PUBLIC, ACC_SUPER
 this_class: #4 // com/example/javac/Java9StringDemo
 super_class: #5 // java/lang/Object
 interfaces: 0, fields: 0, methods: 2, attributes: 3
Constant pool:
 #1=Methodref #5.#14 // java/lang/Object."<init>":()V
 #2=String #15 // tom
 #3=InvokeDynamic #0:#19 // #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
 #4=Class #20 // com/example/javac/Java9StringDemo
 #5=Class #21 // java/lang/Object
 #6=Utf8 <init>
 #7=Utf8 ()V
 #8=Utf8 Code
 #9=Utf8 LineNumberTable
 #10=Utf8 main
 #11=Utf8 ([Ljava/lang/String;)V
 #12=Utf8 SourceFile
 #13=Utf8 Java9StringDemo.java
 #14=NameAndType #6:#7 // "<init>":()V
 #15=Utf8 tom
 #16=Utf8 BootstrapMethods
 #17=MethodHandle 6:#22 // REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
 #18=String #23 // \u0001cat
 #19=NameAndType #24:#25 // makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
 #20=Utf8 com/example/javac/Java9StringDemo
 #21=Utf8 java/lang/Object
 #22=Methodref #26.#27 // java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
 #23=Utf8 \u0001cat
 #24=Utf8 makeConcatWithConstants
 #25=Utf8 (Ljava/lang/String;)Ljava/lang/String;
 #26=Class #28 // java/lang/invoke/StringConcatFactory
 #27=NameAndType #24:#32 // makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
 #28=Utf8 java/lang/invoke/StringConcatFactory
 #29=Class #34 // java/lang/invoke/MethodHandles$Lookup
 #30=Utf8 Lookup
 #31=Utf8 InnerClasses
 #32=Utf8 (Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
 #33=Class #35 // java/lang/invoke/MethodHandles
 #34=Utf8 java/lang/invoke/MethodHandles$Lookup
 #35=Utf8 java/lang/invoke/MethodHandles
{
 public com.example.javac.Java9StringDemo();
 descriptor: ()V
 flags: (0x0001) ACC_PUBLIC
 Code:
 stack=1, locals=1, args_size=1
 0: aload_0
 1: invokespecial #1 // Method java/lang/Object."<init>":()V
 4: return
 LineNumberTable:
 line 8: 0
 public static void main(java.lang.String[]);
 descriptor: ([Ljava/lang/String;)V
 flags: (0x0009) ACC_PUBLIC, ACC_STATIC
 Code:
 stack=1, locals=3, args_size=1
 0: ldc #2 // String tom
 2: astore_1
 3: aload_1
 4: invokedynamic #3, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
 9: astore_2
 10: return
 LineNumberTable:
 line 11: 0
 line 12: 3
 line 13: 10
}
SourceFile: "Java9StringDemo.java"
InnerClasses:
 public static final #30=#29 of #33; // Lookup=class java/lang/invoke/MethodHandles$Lookup of class java/lang/invoke/MethodHandles
BootstrapMethods:
 0: #17 REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
 Method arguments:
 #18 \u0001cat
  • The javap output shows that Java 9 uses invokedynamic to call StringConcatFactory.makeConcatWithConstants to optimize string concatenation, whereas Java 8 compiles the same expression into StringBuilder calls
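
To see what the bootstrap method does with the recipe from the constant pool, it can be called by hand. Normally javac emits the invokedynamic instruction and the JVM performs this call; the Java sketch below (class and method names are hypothetical) does the same thing explicitly, using the recipe \u0001cat, in which \u0001 marks the position of the dynamic argument.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class ConcatRecipeDemo {

    /** Builds a call site for the recipe "<arg>cat" and invokes it once. */
    static String concatWithRecipe(String arg) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),                            // caller's lookup (needs private access)
                    "makeConcatWithConstants",                         // name, as seen in the javap output
                    MethodType.methodType(String.class, String.class), // (String)String
                    "\u0001cat");                                      // recipe: one argument, then "cat"
            MethodHandle handle = site.dynamicInvoker();
            return (String) handle.invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException("string concat bootstrap failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatWithRecipe("tom"));
    }
}
```

In compiled code the call site is linked once on first execution and then reused, so the bootstrap cost is not paid on every concatenation.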

StringConcatFactory.makeConcatWithConstants

java.base/java/lang/invoke/StringConcatFactory.java

public final class StringConcatFactory {
 //......
 /**
 * Concatenation strategy to use. See {@link Strategy} for possible options.
 * This option is controllable with -Djava.lang.invoke.stringConcat JDK option.
 */
 private static Strategy STRATEGY;
 /**
 * Default strategy to use for concatenation.
 */
 private static final Strategy DEFAULT_STRATEGY=Strategy.MH_INLINE_SIZED_EXACT;
 private enum Strategy {
 /**
 * Bytecode generator, calling into {@link java.lang.StringBuilder}.
 */
 BC_SB,
 /**
 * Bytecode generator, calling into {@link java.lang.StringBuilder};
 * but trying to estimate the required storage.
 */
 BC_SB_SIZED,
 /**
 * Bytecode generator, calling into {@link java.lang.StringBuilder};
 * but computing the required storage exactly.
 */
 BC_SB_SIZED_EXACT,
 /**
 * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
 * This strategy also tries to estimate the required storage.
 */
 MH_SB_SIZED,
 /**
 * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
 * This strategy also estimate the required storage exactly.
 */
 MH_SB_SIZED_EXACT,
 /**
 * MethodHandle-based generator, that constructs its own byte[] array from
 * the arguments. It computes the required storage exactly.
 */
 MH_INLINE_SIZED_EXACT
 }
 static {
 // In case we need to double-back onto the StringConcatFactory during this
 // static initialization, make sure we have the reasonable defaults to complete
 // the static initialization properly. After that, actual users would use
 // the proper values we have read from the properties.
 STRATEGY=DEFAULT_STRATEGY;
 // CACHE_ENABLE=false; // implied
 // CACHE=null; // implied
 // DEBUG=false; // implied
 // DUMPER=null; // implied
 Properties props=GetPropertyAction.privilegedGetProperties();
 final String strategy=props.getProperty("java.lang.invoke.stringConcat");
 CACHE_ENABLE=Boolean.parseBoolean(
 props.getProperty("java.lang.invoke.stringConcat.cache"));
 DEBUG=Boolean.parseBoolean(
 props.getProperty("java.lang.invoke.stringConcat.debug"));
 final String dumpPath=props.getProperty("java.lang.invoke.stringConcat.dumpClasses");
 STRATEGY=(strategy==null) ? DEFAULT_STRATEGY : Strategy.valueOf(strategy);
 CACHE=CACHE_ENABLE ? new ConcurrentHashMap<>() : null;
 DUMPER=(dumpPath==null) ? null : ProxyClassesDumper.getInstance(dumpPath);
 }
 public static CallSite makeConcatWithConstants(MethodHandles.Lookup lookup,
 String name,
 MethodType concatType,
 String recipe,
 Object... constants) throws StringConcatException {
 if (DEBUG) {
 System.out.println("StringConcatFactory " + STRATEGY + " is here for " + concatType + ", {" + recipe + "}, " + Arrays.toString(constants));
 }
 return doStringConcat(lookup, name, concatType, false, recipe, constants);
 }
 private static CallSite doStringConcat(MethodHandles.Lookup lookup,
 String name,
 MethodType concatType,
 boolean generateRecipe,
 String recipe,
 Object... constants) throws StringConcatException {
 Objects.requireNonNull(lookup, "Lookup is null");
 Objects.requireNonNull(name, "Name is null");
 Objects.requireNonNull(concatType, "Concat type is null");
 Objects.requireNonNull(constants, "Constants are null");
 for (Object o : constants) {
 Objects.requireNonNull(o, "Cannot accept null constants");
 }
 if ((lookup.lookupModes() & MethodHandles.Lookup.PRIVATE)==0) {
 throw new StringConcatException("Invalid caller: " +
 lookup.lookupClass().getName());
 }
 int cCount=0;
 int oCount=0;
 if (generateRecipe) {
 // Mock the recipe to reuse the concat generator code
 char[] value=new char[concatType.parameterCount()];
 Arrays.fill(value, TAG_ARG);
 recipe=new String(value);
 oCount=concatType.parameterCount();
 } else {
 Objects.requireNonNull(recipe, "Recipe is null");
 for (int i=0; i < recipe.length(); i++) {
 char c=recipe.charAt(i);
 if (c==TAG_CONST) cCount++;
 if (c==TAG_ARG) oCount++;
 }
 }
 if (oCount !=concatType.parameterCount()) {
 throw new StringConcatException(
 "Mismatched number of concat arguments: recipe wants " +
 oCount +
 " arguments, but signature provides " +
 concatType.parameterCount());
 }
 if (cCount !=constants.length) {
 throw new StringConcatException(
 "Mismatched number of concat constants: recipe wants " +
 cCount +
 " constants, but only " +
 constants.length +
 " are passed");
 }
 if (!concatType.returnType().isAssignableFrom(String.class)) {
 throw new StringConcatException(
 "The return type should be compatible with String, but it is " +
 concatType.returnType());
 }
 if (concatType.parameterSlotCount() > MAX_INDY_CONCAT_ARG_SLOTS) {
 throw new StringConcatException("Too many concat argument slots: " +
 concatType.parameterSlotCount() +
 ", can only accept " +
 MAX_INDY_CONCAT_ARG_SLOTS);
 }
 String className=getClassName(lookup.lookupClass());
 MethodType mt=adaptType(concatType);
 Recipe rec=new Recipe(recipe, constants);
 MethodHandle mh;
 if (CACHE_ENABLE) {
 Key key=new Key(className, mt, rec);
 mh=CACHE.get(key);
 if (mh==null) {
 mh=generate(lookup, className, mt, rec);
 CACHE.put(key, mh);
 }
 } else {
 mh=generate(lookup, className, mt, rec);
 }
 return new ConstantCallSite(mh.asType(concatType));
 }
 private static MethodHandle generate(Lookup lookup, String className, MethodType mt, Recipe recipe) throws StringConcatException {
 try {
 switch (STRATEGY) {
 case BC_SB:
 return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.DEFAULT);
 case BC_SB_SIZED:
 return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED);
 case BC_SB_SIZED_EXACT:
 return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED_EXACT);
 case MH_SB_SIZED:
 return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED);
 case MH_SB_SIZED_EXACT:
 return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED_EXACT);
 case MH_INLINE_SIZED_EXACT:
 return MethodHandleInlineCopyStrategy.generate(mt, recipe);
 default:
 throw new StringConcatException("Concatenation strategy " + STRATEGY + " is not implemented");
 }
 } catch (Error | StringConcatException e) {
 // Pass through any error or existing StringConcatException
 throw e;
 } catch (Throwable t) {
 throw new StringConcatException("Generator failed", t);
 }
 }
 //......
}
  • makeConcatWithConstants calls doStringConcat, which validates its arguments and then calls generate to produce the MethodHandle. generate chooses a generator according to STRATEGY, which is one of BC_SB, BC_SB_SIZED, BC_SB_SIZED_EXACT, MH_SB_SIZED, MH_SB_SIZED_EXACT, or MH_INLINE_SIZED_EXACT. The default is MH_INLINE_SIZED_EXACT, and it can be overridden with -Djava.lang.invoke.stringConcat.
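To make the flow above concrete, here is a minimal sketch that calls StringConcatFactory.makeConcatWithConstants by hand, roughly the way an invokedynamic bootstrap would. The class name IndyConcatDemo, the method names, and the recipe text are made up for illustration. Only the java.lang.invoke calls are real JDK API.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatException;
import java.lang.invoke.StringConcatFactory;

// Hypothetical demo class, not part of the JDK.
final class IndyConcatDemo {

    // Builds a call site equivalent to: "id=" + id + ", name=" + name
    static String describe(int id, String name) {
        // \u0001 marks a dynamic argument (TAG_ARG). \u0002 would mark a
        // constant supplied through the trailing varargs (TAG_CONST).
        // Any other character is copied into the result as a literal.
        String recipe = "id=\u0001, name=\u0001";
        MethodType concatType =
                MethodType.methodType(String.class, int.class, String.class);
        try {
            // MethodHandles.lookup() has PRIVATE access to this class, which
            // satisfies the "Invalid caller" check in doStringConcat.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(), "describe", concatType, recipe);
            MethodHandle concat = site.dynamicInvoker();
            return (String) concat.invokeExact(id, name);
        } catch (Throwable t) {
            throw new IllegalStateException("concat call site failed", t);
        }
    }

    // Two argument tags against a one-parameter signature should trip the
    // "Mismatched number of concat arguments" check.
    static boolean mismatchIsRejected() {
        try {
            StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(), "bad",
                    MethodType.methodType(String.class, int.class),
                    "\u0001\u0001");
            return false;
        } catch (StringConcatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(7, "Ada"));
        System.out.println(mismatchIsRejected());
    }
}
```

In normal code javac emits the invokedynamic instruction and the JVM runs this bootstrap once per call site, so application code rarely needs to call the factory directly.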

Summary

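  • Java 9 introduced Compact Strings to replace the Compressed Strings of Java 6. The new implementation goes further: String stores its content in a byte[] instead of a char[], and a new field, coder, records whether that content is encoded as LATIN1 or UTF16.
  • isLatin1 returns true when COMPACT_STRINGS is true and coder is LATIN1. Methods such as charAt, equals, hashCode, indexOf, and substring all rely on isLatin1 to decide whether to delegate to StringLatin1 or StringUTF16.
  • Java 9 compiles string concatenation to an invokedynamic call bootstrapped by StringConcatFactory.makeConcatWithConstants. Java 8 optimized concatenation by rewriting it into StringBuilder calls, whereas Java 9 offers several strategies to choose from: BC_SB (equivalent to the Java 8 approach), BC_SB_SIZED, BC_SB_SIZED_EXACT, MH_SB_SIZED, MH_SB_SIZED_EXACT, and MH_INLINE_SIZED_EXACT. The default is MH_INLINE_SIZED_EXACT, and it can be overridden with -Djava.lang.invoke.stringConcat.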
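The coder and isLatin1 dispatch described in the summary can be illustrated with a toy model. This is a simplified sketch, not the JDK source. The class CompactText and its methods are hypothetical, and it always stores UTF16 data big-endian, whereas the real implementation follows platform byte order.

```java
// Hypothetical toy model of the Compact Strings layout; not the JDK source.
final class CompactText {
    static final boolean COMPACT_STRINGS = true;
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value; // one byte per char (LATIN1) or two (UTF16)
    private final byte coder;   // tells readers how to interpret value

    CompactText(String s) {
        if (COMPACT_STRINGS && fitsLatin1(s)) {
            value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
            coder = LATIN1;
        } else {
            value = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                value[2 * i] = (byte) (c >> 8); // high byte first (simplification)
                value[2 * i + 1] = (byte) c;    // low byte
            }
            coder = UTF16;
        }
    }

    private static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    boolean isLatin1() {
        return COMPACT_STRINGS && coder == LATIN1;
    }

    int length() {
        return isLatin1() ? value.length : value.length >> 1;
    }

    // Every accessor branches on isLatin1(), mirroring how String delegates
    // to StringLatin1 or StringUTF16.
    char charAt(int index) {
        if (isLatin1()) {
            return (char) (value[index] & 0xFF);
        }
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    int storageBytes() {
        return value.length;
    }
}
```

In this model a five-character ASCII string occupies five bytes, while a string containing any char above 0xFF falls back to two bytes per char for the whole string.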
