Sample results of .NET vector type operations: a guide to learning the 100-plus vector methods of the Vector class

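    1. Background

    Starting with .NET Core 1.0 (or .NET Framework 4.5, .NET Standard 1.0), .NET has offered vector types with SIMD hardware acceleration.
    Of these, the vectors with a hardware-dependent size are the most useful. They consist of the readonly struct Vector<T> and the supporting static class Vector.
    The readonly struct Vector<T> mainly provides ordinary arithmetic through operators, so its functionality is limited. The static class Vector supplies a large number of functions for vector types and greatly broadens where they can be used.
    However, the static class Vector has more than a hundred methods and the documentation for them is terse, which makes them hard to learn.

    So I wrote a demo program with test code for every one of the hundred-plus vector methods that the static class Vector provides. Comparing the test code and its output against the official documentation makes the methods much easier to understand.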
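
    As a small illustration of the split between the struct and the static class, the sketch below computes the same element-wise sum once with the + operator of Vector<T> and once with Vector.Add, and then calls Vector.Max, which has no operator form. The class and method names (OperatorVersusMethodSketch and so on) are made up for this example and are not part of the demo program.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class, for illustration only.
static class OperatorVersusMethodSketch
{
    // Element-wise sum written with the operator defined on Vector<T>.
    public static Vector<float> SumWithOperator(Vector<float> a, Vector<float> b) => a + b;

    // The same sum through the static Vector class.
    public static Vector<float> SumWithMethod(Vector<float> a, Vector<float> b) => Vector.Add(a, b);

    // An operation with no operator form: element-wise maximum.
    public static Vector<float> Larger(Vector<float> a, Vector<float> b) => Vector.Max(a, b);

    static void Main()
    {
        var a = new Vector<float>(1.5f); // every element is 1.5
        var b = new Vector<float>(2.0f); // every element is 2.0
        Console.WriteLine(SumWithOperator(a, b));
        Console.WriteLine(SumWithMethod(a, b));
        Console.WriteLine(Larger(a, b));
    }
}
```

    The number of elements printed depends on Vector<float>.Count on the machine running the code.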

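    2. Writing the demo program (VectorClassDemo)

    2.1 Project structure

    The solution currently contains these 3 projects:

    • VectorClassDemo: shared project. It holds the common test code.
    • VectorClassDemo20: .NET Core 2.0 console project. It tests behavior on the older .NET Core 2.0.
    • VectorClassDemo50: .NET 5.0 console project. It tests behavior on newer .NET versions. For example, the target framework can be changed temporarily to .NET 7.0 to see how things behave under .NET 7.0.

    To make testing against different target frameworks easier, the common test code lives in the shared project. This keeps the code reusable and the console projects simple. For example, Program.cs in VectorClassDemo50 is: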

    using System;
    using System.IO;
    using VectorClassDemo;

    namespace VectorClassDemo50 {
        class Program {
            static void Main(string[] args) {
                string indent = "";
                TextWriter tw = Console.Out;
                tw.WriteLine("VectorClassDemo50");
                tw.WriteLine();
                VectorDemo.OutputEnvironment(tw, indent);
                tw.WriteLine();
                VectorDemo.Run(tw, indent);
            }
        }
    }

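    2.2 Printing environment information (OutputEnvironment)

    Because the tests run on several platforms and each platform's environment differs, a dedicated function prints the environment information. The source is as follows.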

    // Needs: using System.Numerics; using System.Reflection;
    // and, under the same #if guards, System.Runtime.Intrinsics.X86 / System.Runtime.Intrinsics.Arm.

    /// <summary>
    /// Is release make.
    /// </summary>
    public static readonly bool IsRelease =
    #if DEBUG
        false
    #else
        true
    #endif
    ;

    /// <summary>
    /// Output Environment.
    /// </summary>
    /// <param name="tw">Output <see cref="TextWriter"/>.</param>
    /// <param name="indent">The indent.</param>
    public static void OutputEnvironment(TextWriter tw, string indent) {
        if (null == tw) return;
        if (null == indent) indent = "";
        //string indentNext = indent + "\t";
        tw.WriteLine(indent + string.Format("IsRelease:\t{0}", IsRelease));
        tw.WriteLine(indent + string.Format("EnvironmentVariable(PROCESSOR_IDENTIFIER):\t{0}", Environment.GetEnvironmentVariable("PROCESSOR_IDENTIFIER")));
        tw.WriteLine(indent + string.Format("Environment.ProcessorCount:\t{0}", Environment.ProcessorCount));
        tw.WriteLine(indent + string.Format("Environment.Is64BitOperatingSystem:\t{0}", Environment.Is64BitOperatingSystem));
        tw.WriteLine(indent + string.Format("Environment.Is64BitProcess:\t{0}", Environment.Is64BitProcess));
        tw.WriteLine(indent + string.Format("Environment.OSVersion:\t{0}", Environment.OSVersion));
        tw.WriteLine(indent + string.Format("Environment.Version:\t{0}", Environment.Version));
        //tw.WriteLine(indent + string.Format("RuntimeEnvironment.GetSystemVersion:\t{0}", System.Runtime.InteropServices.RuntimeEnvironment.GetSystemVersion())); // Same Environment.Version
        tw.WriteLine(indent + string.Format("RuntimeEnvironment.GetRuntimeDirectory:\t{0}", System.Runtime.InteropServices.RuntimeEnvironment.GetRuntimeDirectory()));
    #if (NET47 || NET462 || NET461 || NET46 || NET452 || NET451 || NET45 || NET40 || NET35 || NET20) || (NETSTANDARD1_0)
    #else
        tw.WriteLine(indent + string.Format("RuntimeInformation.FrameworkDescription:\t{0}", System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription));
    #endif
        tw.WriteLine(indent + string.Format("BitConverter.IsLittleEndian:\t{0}", BitConverter.IsLittleEndian));
        tw.WriteLine(indent + string.Format("IntPtr.Size:\t{0}", IntPtr.Size));
        tw.WriteLine(indent + string.Format("Vector.IsHardwareAccelerated:\t{0}", Vector.IsHardwareAccelerated));
        tw.WriteLine(indent + string.Format("Vector<byte>.Count:\t{0}\t# {1}bit", Vector<byte>.Count, Vector<byte>.Count * sizeof(byte) * 8));
        //tw.WriteLine(indent + string.Format("Vector<float>.Count:\t{0}\t# {1}bit", Vector<float>.Count, Vector<float>.Count * sizeof(float) * 8));
        //tw.WriteLine(indent + string.Format("Vector<double>.Count:\t{0}\t# {1}bit", Vector<double>.Count, Vector<double>.Count * sizeof(double) * 8));
        Assembly assembly;
        //assembly = typeof(Vector4).GetTypeInfo().Assembly;
        //tw.WriteLine(string.Format("Vector4.Assembly:\t{0}", assembly));
        //tw.WriteLine(string.Format("Vector4.Assembly.CodeBase:\t{0}", assembly.CodeBase));
        assembly = typeof(Vector<float>).GetTypeInfo().Assembly;
        tw.WriteLine(string.Format("Vector<T>.Assembly.CodeBase:\t{0}", assembly.CodeBase));

        OutputIntrinsics(tw, indent);
    }

    /// <summary>
    /// Output Intrinsics.
    /// </summary>
    /// <param name="tw">Output <see cref="TextWriter"/>.</param>
    /// <param name="indent">The indent.</param>
    public static void OutputIntrinsics(TextWriter tw, string indent) {
        if (null == tw) return;
        if (null == indent) indent = "";
    #if NETCOREAPP3_0_OR_GREATER
        tw.WriteLine();
        tw.WriteLine(indent + "[Intrinsics.X86]");
        WriteLineFormat(tw, indent, "Aes.IsSupported:\t{0}", System.Runtime.Intrinsics.X86.Aes.IsSupported);
        WriteLineFormat(tw, indent, "Aes.X64.IsSupported:\t{0}", System.Runtime.Intrinsics.X86.Aes.X64.IsSupported);
        WriteLineFormat(tw, indent, "Avx.IsSupported:\t{0}", Avx.IsSupported);
        WriteLineFormat(tw, indent, "Avx.X64.IsSupported:\t{0}", Avx.X64.IsSupported);
        WriteLineFormat(tw, indent, "Avx2.IsSupported:\t{0}", Avx2.IsSupported);
        WriteLineFormat(tw, indent, "Avx2.X64.IsSupported:\t{0}", Avx2.X64.IsSupported);
    #if NET6_0_OR_GREATER
        WriteLineFormat(tw, indent, "AvxVnni.IsSupported:\t{0}", AvxVnni.IsSupported);
        WriteLineFormat(tw, indent, "AvxVnni.X64.IsSupported:\t{0}", AvxVnni.X64.IsSupported);
    #endif
        WriteLineFormat(tw, indent, "Bmi1.IsSupported:\t{0}", Bmi1.IsSupported);
        WriteLineFormat(tw, indent, "Bmi1.X64.IsSupported:\t{0}", Bmi1.X64.IsSupported);
        WriteLineFormat(tw, indent, "Bmi2.IsSupported:\t{0}", Bmi2.IsSupported);
        WriteLineFormat(tw, indent, "Bmi2.X64.IsSupported:\t{0}", Bmi2.X64.IsSupported);
        WriteLineFormat(tw, indent, "Fma.IsSupported:\t{0}", Fma.IsSupported);
        WriteLineFormat(tw, indent, "Fma.X64.IsSupported:\t{0}", Fma.X64.IsSupported);
        WriteLineFormat(tw, indent, "Lzcnt.IsSupported:\t{0}", Lzcnt.IsSupported);
        WriteLineFormat(tw, indent, "Lzcnt.X64.IsSupported:\t{0}", Lzcnt.X64.IsSupported);
        WriteLineFormat(tw, indent, "Pclmulqdq.IsSupported:\t{0}", Pclmulqdq.IsSupported);
        WriteLineFormat(tw, indent, "Pclmulqdq.X64.IsSupported:\t{0}", Pclmulqdq.X64.IsSupported);
        WriteLineFormat(tw, indent, "Popcnt.IsSupported:\t{0}", Popcnt.IsSupported);
        WriteLineFormat(tw, indent, "Popcnt.X64.IsSupported:\t{0}", Popcnt.X64.IsSupported);
        WriteLineFormat(tw, indent, "Sse.IsSupported:\t{0}", Sse.IsSupported);
        WriteLineFormat(tw, indent, "Sse.X64.IsSupported:\t{0}", Sse.X64.IsSupported);
        WriteLineFormat(tw, indent, "Sse2.IsSupported:\t{0}", Sse2.IsSupported);
        WriteLineFormat(tw, indent, "Sse2.X64.IsSupported:\t{0}", Sse2.X64.IsSupported);
        WriteLineFormat(tw, indent, "Sse3.IsSupported:\t{0}", Sse3.IsSupported);
        WriteLineFormat(tw, indent, "Sse3.X64.IsSupported:\t{0}", Sse3.X64.IsSupported);
        WriteLineFormat(tw, indent, "Sse41.IsSupported:\t{0}", Sse41.IsSupported);
        WriteLineFormat(tw, indent, "Sse41.X64.IsSupported:\t{0}", Sse41.X64.IsSupported);
        WriteLineFormat(tw, indent, "Sse42.IsSupported:\t{0}", Sse42.IsSupported);
        WriteLineFormat(tw, indent, "Sse42.X64.IsSupported:\t{0}", Sse42.X64.IsSupported);
        WriteLineFormat(tw, indent, "Ssse3.IsSupported:\t{0}", Ssse3.IsSupported);
        WriteLineFormat(tw, indent, "Ssse3.X64.IsSupported:\t{0}", Ssse3.X64.IsSupported);
    #if NET5_0_OR_GREATER
        WriteLineFormat(tw, indent, "X86Base.IsSupported:\t{0}", X86Base.IsSupported);
        WriteLineFormat(tw, indent, "X86Base.X64.IsSupported:\t{0}", X86Base.X64.IsSupported);
    #endif // NET5_0_OR_GREATER
    #if NET7_0_OR_GREATER
        WriteLineFormat(tw, indent, "X86Serialize.IsSupported:\t{0}", X86Serialize.IsSupported);
        WriteLineFormat(tw, indent, "X86Serialize.X64.IsSupported:\t{0}", X86Serialize.X64.IsSupported);
    #endif // NET7_0_OR_GREATER
    #endif // NETCOREAPP3_0_OR_GREATER

    #if NET5_0_OR_GREATER
        tw.WriteLine();
        tw.WriteLine(indent + "[Intrinsics.Arm]");
        WriteLineFormat(tw, indent, "AdvSimd.IsSupported:\t{0}", AdvSimd.IsSupported);
        WriteLineFormat(tw, indent, "AdvSimd.Arm64.IsSupported:\t{0}", AdvSimd.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "Aes.IsSupported:\t{0}", System.Runtime.Intrinsics.Arm.Aes.IsSupported);
        WriteLineFormat(tw, indent, "Aes.Arm64.IsSupported:\t{0}", System.Runtime.Intrinsics.Arm.Aes.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "ArmBase.IsSupported:\t{0}", ArmBase.IsSupported);
        WriteLineFormat(tw, indent, "ArmBase.Arm64.IsSupported:\t{0}", ArmBase.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "Crc32.IsSupported:\t{0}", Crc32.IsSupported);
        WriteLineFormat(tw, indent, "Crc32.Arm64.IsSupported:\t{0}", Crc32.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "Dp.IsSupported:\t{0}", Dp.IsSupported);
        WriteLineFormat(tw, indent, "Dp.Arm64.IsSupported:\t{0}", Dp.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "Rdm.IsSupported:\t{0}", Rdm.IsSupported);
        WriteLineFormat(tw, indent, "Rdm.Arm64.IsSupported:\t{0}", Rdm.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "Sha1.IsSupported:\t{0}", Sha1.IsSupported);
        WriteLineFormat(tw, indent, "Sha1.Arm64.IsSupported:\t{0}", Sha1.Arm64.IsSupported);
        WriteLineFormat(tw, indent, "Sha256.IsSupported:\t{0}", Sha256.IsSupported);
        WriteLineFormat(tw, indent, "Sha256.Arm64.IsSupported:\t{0}", Sha256.Arm64.IsSupported);
    #endif // NET5_0_OR_GREATER
    }

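    Because vector types are closely tied to intrinsic functions (intrinsics), this function also prints which intrinsics are supported.
    During development I found that each .NET release adds more intrinsics. For example, .NET 5.0 added a large number of Arm intrinsics as well as X86Base.
    Conditional compilation makes it possible to use only the classes that the current .NET version allows.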
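
    Reduced to a single class, the conditional-compilation pattern might look like the sketch below. It relies on the NETCOREAPP3_0_OR_GREATER symbol, which the function above also uses and which recent SDKs define automatically; the IntrinsicsProbeSketch class and its DescribeAvx2 method are invented for this example.

```csharp
using System;
#if NETCOREAPP3_0_OR_GREATER
using System.Runtime.Intrinsics.X86;
#endif

// Hypothetical helper class, for illustration only.
static class IntrinsicsProbeSketch
{
    // Reports AVX2 support when the API exists on the target framework,
    // and falls back to a fixed message when it does not.
    public static string DescribeAvx2()
    {
#if NETCOREAPP3_0_OR_GREATER
        return "Avx2.IsSupported: " + Avx2.IsSupported;
#else
        return "Avx2: API not available on this target framework";
#endif
    }

    static void Main()
    {
        Console.WriteLine(DescribeAvx2());
    }
}
```

    The compile-time guard decides whether the class may be referenced at all, while IsSupported answers the separate run-time question of whether the current processor has the instructions.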

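    2.3 Creating test data (CreateVectorUseRotate)

    The constructors of Vector<T> can only create a value in which a single number is repeated, or take every number one by one from an array (or Span). The former is too rigid and the latter too tedious, because the length of Vector<T> differs between processors.
    At present Vector<T> is 256 bits wide on machines that support the AVX2 instruction set and 128 bits wide otherwise. For example, a 128-bit Vector<T> holds 4 Single values and a 256-bit Vector<T> holds 8. In the future Vector<T> may well be 512 bits or wider.
    For testing, a repeating sequence of numbers is often enough: "a,b,c,d" at 128 bits and "a,b,c,d,a,b,c,d" at 256 bits.
    So I wrote a function that cycles through a finite list of values to fill every element of the vector. It takes a variable number of arguments declared with params, which makes it very convenient to call. The code is as follows.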

    /// <summary>
    /// Create Vector&lt;T&gt; use rotate.
    /// </summary>
    /// <typeparam name="T">Vector type.</typeparam>
    /// <param name="list">Source value list.</param>
    /// <returns>Returns Vector&lt;T&gt;.</returns>
    static Vector<T> CreateVectorUseRotate<T>(params T[] list) where T : struct {
        if (null == list || list.Length <= 0) return Vector<T>.Zero;
        T[] arr = new T[Vector<T>.Count];
        int idx = 0;
        for (int i = 0; i < arr.Length; ++i) {
            arr[i] = list[idx];
            ++idx;
            if (idx >= list.Length) idx = 0;   // wrap around to the start of the list
        }
        Vector<T> rt = new Vector<T>(arr);
        return rt;
    }
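
    To show the fill pattern on a concrete input, here is a condensed sketch of the same idea that uses the modulo operator in place of the wrapping index. RotateFillSketch and Fill are names chosen for this example, not the demo's own.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class, for illustration only.
static class RotateFillSketch
{
    // Fills every element of the vector by cycling through the given values.
    public static Vector<T> Fill<T>(params T[] list) where T : struct
    {
        if (list == null || list.Length == 0) return Vector<T>.Zero;
        T[] arr = new T[Vector<T>.Count];
        for (int i = 0; i < arr.Length; ++i) arr[i] = list[i % list.Length];
        return new Vector<T>(arr);
    }

    static void Main()
    {
        Vector<short> v = Fill<short>(10, 20, 30);
        // The element count depends on the hardware: 8 at 128 bits, 16 at 256 bits.
        Console.WriteLine("Vector<short>.Count = " + Vector<short>.Count);
        Console.WriteLine(v); // the elements repeat 10, 20, 30, 10, 20, 30, ...
    }
}
```

    When the list length does not divide Vector<T>.Count, as with three values here, the last repetition is simply cut short.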

    2.4 Starting the tests (Run)

    With CreateVectorUseRotate building the test data, the skeleton of the test program is easy to set up. The code is as follows:

    public static void Run(TextWriter tw, string indent) {
        RunType(tw, indent, CreateVectorUseRotate(float.MinValue, float.PositiveInfinity, float.NaN, -1.2f, 0f, 1f, 2f, 4f), new Vector<float>(2.0f));
        RunType(tw, indent, CreateVectorUseRotate(double.MinValue, double.PositiveInfinity, -1.2, 0), new Vector<double>(2.0));
        RunType(tw, indent, CreateVectorUseRotate<sbyte>(sbyte.MinValue, sbyte.MaxValue, -1, 0, 1, 2, 3, 4), new Vector<sbyte>(2));
        RunType(tw, indent, CreateVectorUseRotate<short>(short.MinValue, short.MaxValue, -1, 0, 1, 2, 3, 4, 127, 128), new Vector<short>(2));
        RunType(tw, indent, CreateVectorUseRotate<int>(int.MinValue, int.MaxValue, -1, 0, 1, 2, 3, 32768), new Vector<int>(2));
        RunType(tw, indent, CreateVectorUseRotate<long>(long.MinValue, long.MaxValue, -1, 0, 1, 2, 3), new Vector<long>(2));
        RunType(tw, indent, CreateVectorUseRotate<byte>(byte.MinValue, byte.MaxValue, 0, 1, 2, 3, 4), new Vector<byte>(2));
        RunType(tw, indent, CreateVectorUseRotate<ushort>(ushort.MinValue, ushort.MaxValue, 0, 1, 2, 3, 4, 255, 256), new Vector<ushort>(2));
        RunType(tw, indent, CreateVectorUseRotate<uint>(uint.MinValue, uint.MaxValue, 0, 1, 2, 3, 65536), new Vector<uint>(2));
        RunType(tw, indent, CreateVectorUseRotate<ulong>(ulong.MinValue, ulong.MaxValue, 0, 1, 2, 3), new Vector<ulong>(2));
    }

    2.5 Testing a given type (RunType)

    RunType is a generic function that tests each numeric type in turn. The main code is as follows.

    /// <summary>
    /// Run type demo.
    /// </summary>
    /// <typeparam name="T">Vector type.</typeparam>
    /// <param name="tw">Output <see cref="TextWriter"/>.</param>
    /// <param name="indent">The indent.</param>
    /// <param name="srcT">Source temp value.</param>
    /// <param name="src2">Source 2.</param>
    static void RunType<T>(TextWriter tw, string indent, Vector<T> srcT, Vector<T> src2) where T : struct {
        Vector<T> src0 = Vector<T>.Zero;
        Vector<T> src1 = Vector<T>.One;
        Vector<T> srcAllOnes = ~Vector<T>.Zero;
        int elementBitSize = (Vector<byte>.Count / Vector<T>.Count) * 8;
        tw.WriteLine(indent + string.Format("-- {0}, Vector<{0}>.Count={1} --", typeof(T).Name, Vector<T>.Count));
        WriteLineFormat(tw, indent, "srcT:\t{0}", srcT);
        //WriteLineFormat(tw, indent, "src2:\t{0}", src2);
        WriteLineFormat(tw, indent, "srcAllOnes:\t{0}", srcAllOnes);

        // -- Methods --
        #region Methods
        //Abs<T>(Vector<T>) Returns a new vector whose elements are the absolute values of the given vector's elements.
        WriteLineFormat(tw, indent, "Abs(srcT):\t{0}", Vector.Abs(srcT));
        WriteLineFormat(tw, indent, "Abs(srcAllOnes):\t{0}", Vector.Abs(srcAllOnes));

        //Add<T>(Vector<T>, Vector<T>) Returns a new vector whose values are the sum of each pair of elements from two given vectors.
        WriteLineFormat(tw, indent, "Add(srcT, src1):\t{0}", Vector.Add(srcT, src1));
        WriteLineFormat(tw, indent, "Add(srcT, src2):\t{0}", Vector.Add(srcT, src2));

        //AndNot<T>(Vector<T>, Vector<T>) Returns a new vector by performing a bitwise And Not operation on each pair of corresponding elements in two vectors.
        WriteLineFormat(tw, indent, "AndNot(srcT, src1):\t{0}", Vector.AndNot(srcT, src1));
        WriteLineFormat(tw, indent, "AndNot(srcT, src2):\t{0}", Vector.AndNot(srcT, src2));
        // ...

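    The parameter list carries 2 test vectors, srcT and src2.
    The start of the method defines some commonly used vector values: src0 (all elements 0), src1 (all elements 1) and srcAllOnes (every bit set to 1). It then prints srcT and srcAllOnes so that results can be checked by hand.

    After that, each method of the static class Vector is tested in turn.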
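
    As an example of the kind of hand check this enables, the sketch below evaluates Vector.AndNot on one known input. AndNot(left, right) computes left & ~right for each element, so all-ones AndNot 1 clears only the lowest bit. The AndNotSketch class is invented for this example.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class, for illustration only.
static class AndNotSketch
{
    // Broadcasts both operands, applies Vector.AndNot, and returns element 0.
    public static uint FirstElement(uint left, uint right)
    {
        Vector<uint> result = Vector.AndNot(new Vector<uint>(left), new Vector<uint>(right));
        return result[0];
    }

    static void Main()
    {
        // 0xFFFFFFFF & ~0x00000001 = 0xFFFFFFFE
        Console.WriteLine(FirstElement(0xFFFFFFFFu, 1u).ToString("X8"));
    }
}
```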

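    2.5.1 Non-generic methods

    Most methods of the static class Vector are generic, which makes them convenient to call from a generic method such as RunType.
    Some of its methods are not generic, however. They are provided as a set of overloads, one per element type. These are more awkward to use and need branches based on typeof. The code is as follows.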

    //ConvertToDouble(Vector<Int64>) Converts a Vector<Int64> to a Vector<Double>.
    //ConvertToDouble(Vector<UInt64>) Converts a Vector<UInt64> to a Vector<Double>.
    //ConvertToInt32(Vector<Single>) Converts a Vector<Single> to a Vector<Int32>.
    //ConvertToInt64(Vector<Double>) Converts a Vector<Double> to a Vector<Int64>.
    //ConvertToSingle(Vector<Int32>) Converts a Vector<Int32> to a Vector<Single>.
    //ConvertToSingle(Vector<UInt32>) Converts a Vector<UInt32> to a Vector<Single>.
    //ConvertToUInt32(Vector<Single>) Converts a Vector<Single> to a Vector<UInt32>.
    //ConvertToUInt64(Vector<Double>) Converts a Vector<Double> to a Vector<UInt64>.
    if (typeof(T) == typeof(Double)) {
        WriteLineFormat(tw, indent, "ConvertToInt64(srcT):\t{0}", Vector.ConvertToInt64(Vector.AsVectorDouble(srcT)));
        WriteLineFormat(tw, indent, "ConvertToUInt64(srcT):\t{0}", Vector.ConvertToUInt64(Vector.AsVectorDouble(srcT)));
    } else if (typeof(T) == typeof(Single)) {
        WriteLineFormat(tw, indent, "ConvertToInt32(srcT):\t{0}", Vector.ConvertToInt32(Vector.AsVectorSingle(srcT)));
        WriteLineFormat(tw, indent, "ConvertToUInt32(srcT):\t{0}", Vector.ConvertToUInt32(Vector.AsVectorSingle(srcT)));
    } else if (typeof(T) == typeof(Int32)) {
        WriteLineFormat(tw, indent, "ConvertToSingle(srcT):\t{0}", Vector.ConvertToSingle(Vector.AsVectorInt32(srcT)));
    } else if (typeof(T) == typeof(UInt32)) {
        WriteLineFormat(tw, indent, "ConvertToSingle(srcT):\t{0}", Vector.ConvertToSingle(Vector.AsVectorUInt32(srcT)));
    } else if (typeof(T) == typeof(Int64)) {
        WriteLineFormat(tw, indent, "ConvertToDouble(srcT):\t{0}", Vector.ConvertToDouble(Vector.AsVectorInt64(srcT)));
    } else if (typeof(T) == typeof(UInt64)) {
        WriteLineFormat(tw, indent, "ConvertToDouble(srcT):\t{0}", Vector.ConvertToDouble(Vector.AsVectorUInt64(srcT)));
    }
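
    The typeof dispatch can be shown in isolation with a small generic wrapper. The sketch below handles only two element types and throws for the rest. TypeDispatchSketch and ToInt32 are hypothetical names, and the inputs are ordinary in-range values, since conversions of NaN or out-of-range values are not covered here.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class, for illustration only.
static class TypeDispatchSketch
{
    // Picks the matching non-generic overload by inspecting typeof(T).
    // Vector.AsVectorSingle / AsVectorInt32 reinterpret the bits without converting.
    public static Vector<int> ToInt32<T>(Vector<T> v) where T : struct
    {
        if (typeof(T) == typeof(float)) return Vector.ConvertToInt32(Vector.AsVectorSingle(v));
        if (typeof(T) == typeof(int)) return Vector.AsVectorInt32(v);
        throw new NotSupportedException(typeof(T).Name);
    }

    static void Main()
    {
        // The fractional part is dropped (truncation toward zero).
        Console.WriteLine(ToInt32(new Vector<float>(2.75f)));
        Console.WriteLine(ToInt32(new Vector<float>(-2.75f)));
    }
}
```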

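    2.5.2 Testing control values

    Some methods take a control parameter, such as the shift count of ShiftLeft. It is best to write a loop that tests several control values. The code is as follows.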

    #if NET7_0_OR_GREATER
    //ShiftLeft(Vector<Byte>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<Int16>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<Int32>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<Int64>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<IntPtr>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<SByte>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<UInt16>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<UInt32>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<UInt64>, Int32) Shifts each element of a vector left by the specified amount.
    //ShiftLeft(Vector<UIntPtr>, Int32) Shifts each element of a vector left by the specified amount.
    int[] shiftCounts = new int[] { 1, elementBitSize - 1, elementBitSize, elementBitSize + 1, -1 };
    foreach (int shiftCount in shiftCounts) {
        if (typeof(T) == typeof(Byte)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorByte(srcT), shiftCount));
        } else if (typeof(T) == typeof(Int16)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorInt16(srcT), shiftCount));
        } else if (typeof(T) == typeof(Int32)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorInt32(srcT), shiftCount));
        } else if (typeof(T) == typeof(Int64)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorInt64(srcT), shiftCount));
        } else if (typeof(T) == typeof(IntPtr)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorNInt(srcT), shiftCount));
        } else if (typeof(T) == typeof(SByte)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorSByte(srcT), shiftCount));
        } else if (typeof(T) == typeof(UInt16)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorUInt16(srcT), shiftCount));
        } else if (typeof(T) == typeof(UInt32)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorUInt32(srcT), shiftCount));
        } else if (typeof(T) == typeof(UInt64)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorUInt64(srcT), shiftCount));
        } else if (typeof(T) == typeof(UIntPtr)) {
            WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorNUInt(srcT), shiftCount));
        }
    }
    #endif // NET7_0_OR_GREATER
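
    The control values chosen above (bit size minus 1, bit size, bit size plus 1, and -1) probe what happens when the count reaches or exceeds the element width. For comparison, the sketch below runs the same counts through the scalar C# shift on a ulong, where the language uses only the low 6 bits of the count. Whether the vector method wraps the same way for every element type and platform is not something this sketch establishes. ShiftCountSketch is a made-up name.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class ShiftCountSketch
{
    // Scalar 64-bit left shift; C# masks the count to its low 6 bits (count & 0x3F).
    public static ulong ShiftLeft64(ulong value, int count) => value << count;

    static void Main()
    {
        int bits = 64;
        int[] counts = { 1, bits - 1, bits, bits + 1, -1 };
        foreach (int count in counts)
        {
            Console.WriteLine("1 << " + count + " = 0x" + ShiftLeft64(1UL, count).ToString("X16"));
        }
    }
}
```

    For a value of 1, the counts 64 and 65 behave like 0 and 1, and -1 behaves like 63. The UInt64 rows in the results section can be compared against this.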

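    2.5.3 out parameters

    Some methods return more than one value through out parameters, such as Widen, which widens the data. An if block can be used to limit the scope of the differently typed variables. The code is as follows.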

    //Widen(Vector<Byte>, Vector<UInt16>, Vector<UInt16>) Widens a Vector<Byte> into two Vector<UInt16> instances.
    //Widen(Vector<Int16>, Vector<Int32>, Vector<Int32>) Widens a Vector<Int16> into two Vector<Int32> instances.
    //Widen(Vector<Int32>, Vector<Int64>, Vector<Int64>) Widens a Vector<Int32> into two Vector<Int64> instances.
    //Widen(Vector<SByte>, Vector<Int16>, Vector<Int16>) Widens a Vector<SByte> into two Vector<Int16> instances.
    //Widen(Vector<Single>, Vector<Double>, Vector<Double>) Widens a Vector<Single> into two Vector<Double> instances.
    //Widen(Vector<UInt16>, Vector<UInt32>, Vector<UInt32>) Widens a Vector<UInt16> into two Vector<UInt32> instances.
    //Widen(Vector<UInt32>, Vector<UInt64>, Vector<UInt64>) Widens a Vector<UInt32> into two Vector<UInt64> instances.
    if (typeof(T) == typeof(Single)) {
        Vector<Double> low, high;
        Vector.Widen(Vector.AsVectorSingle(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    } else if (typeof(T) == typeof(SByte)) {
        Vector<Int16> low, high;
        Vector.Widen(Vector.AsVectorSByte(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    } else if (typeof(T) == typeof(Int16)) {
        Vector<Int32> low, high;
        Vector.Widen(Vector.AsVectorInt16(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    } else if (typeof(T) == typeof(Int32)) {
        Vector<Int64> low, high;
        Vector.Widen(Vector.AsVectorInt32(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    } else if (typeof(T) == typeof(Byte)) {
        Vector<UInt16> low, high;
        Vector.Widen(Vector.AsVectorByte(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    } else if (typeof(T) == typeof(UInt16)) {
        Vector<UInt32> low, high;
        Vector.Widen(Vector.AsVectorUInt16(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    } else if (typeof(T) == typeof(UInt32)) {
        Vector<UInt64> low, high;
        Vector.Widen(Vector.AsVectorUInt32(srcT), out low, out high);
        WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
        WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
    }

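    To make the two out results of Widen easier to picture, the sketch below widens a byte vector whose elements are their own indices (0, 1, 2, ...). The low result then holds the first half of the elements and the high result the second half, each as UInt16. WidenSketch and WidenIndices are names invented for this example.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class, for illustration only.
static class WidenSketch
{
    // Builds <0, 1, 2, ...> as bytes and widens it into two ushort vectors.
    public static void WidenIndices(out Vector<ushort> low, out Vector<ushort> high)
    {
        byte[] data = new byte[Vector<byte>.Count];
        for (int i = 0; i < data.Length; ++i) data[i] = (byte)i;
        Vector.Widen(new Vector<byte>(data), out low, out high);
    }

    static void Main()
    {
        Vector<ushort> low, high;
        WidenIndices(out low, out high);
        Console.WriteLine("low:  " + low);  // starts at 0
        Console.WriteLine("high: " + high); // starts at Vector<ushort>.Count
    }
}
```
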
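    2.6 Formatted output (WriteLineFormat)

    Although the readonly struct Vector<T> supports ToString and can print the value of each element, it is often necessary to look at the underlying bits (for example when using bitwise functions such as AndNot). That calls for hexadecimal display, but Vector<T> does not support the hexadecimal format specifier (X).
    So I wrote an overload specifically for Vector<T> that also prints its hexadecimal value.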

    // Needs: using System.Text; (for StringBuilder)

    /// <summary>
    /// Get hex string.
    /// </summary>
    /// <typeparam name="T">Vector value type.</typeparam>
    /// <param name="src">Source value.</param>
    /// <param name="separator">The separator.</param>
    /// <param name="noFixEndian">No fix endian.</param>
    /// <returns>Returns hex string.</returns>
    private static string GetHex<T>(Vector<T> src, string separator, bool noFixEndian) where T : struct {
        Vector<byte> list = Vector.AsVectorByte(src);
        int unitCount = Vector<T>.Count;
        int unitSize = Vector<byte>.Count / unitCount;
        bool fixEndian = false;
        if (!noFixEndian && BitConverter.IsLittleEndian) fixEndian = true;
        StringBuilder sb = new StringBuilder();
        if (fixEndian) {
            // IsLittleEndian: walk each element from its highest byte down to its lowest.
            for (int i = 0; i < unitCount; ++i) {
                if ((i > 0)) {
                    if (!string.IsNullOrEmpty(separator)) {
                        sb.Append(separator);
                    }
                }
                int idx = unitSize * (i + 1) - 1;
                for (int j = 0; j < unitSize; ++j) {
                    byte by = list[idx];
                    --idx;
                    sb.Append(by.ToString("X2"));
                }
            }
        } else {
            // Emit the bytes in memory order, with a separator between elements.
            for (int i = 0; i < Vector<byte>.Count; ++i) {
                byte by = list[i];
                if ((i > 0) && (0 == i % unitSize)) {
                    if (!string.IsNullOrEmpty(separator)) {
                        sb.Append(separator);
                    }
                }
                sb.Append(by.ToString("X2"));
            }
        }
        return sb.ToString();
    }

    /// <summary>
    /// WriteLine with format.
    /// </summary>
    /// <typeparam name="T">Vector value type.</typeparam>
    /// <param name="tw">The TextWriter.</param>
    /// <param name="indent">The indent.</param>
    /// <param name="format">The format.</param>
    /// <param name="src">Source value</param>
    private static void WriteLineFormat<T>(TextWriter tw, string indent, string format, Vector<T> src) where T : struct {
        if (null == tw) return;
        string line = indent + string.Format(format, src);
        string hex = GetHex(src, " ", false);
        line += "\t# (" + hex + ")";
        tw.WriteLine(line);
    }
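
    When reading the hexadecimal column, it can help to check a single element against its scalar bit pattern. The sketch below does that for float values with BitConverter.SingleToInt32Bits, which is assumed to be available (it exists on .NET Core 2.0 and later, but not on .NET Framework). ScalarHexSketch and HexOf are hypothetical names.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class ScalarHexSketch
{
    // Returns the IEEE 754 bit pattern of a float as 8 hexadecimal digits.
    public static string HexOf(float value)
    {
        return BitConverter.SingleToInt32Bits(value).ToString("X8");
    }

    static void Main()
    {
        Console.WriteLine(HexOf(1f));    // 3F800000
        Console.WriteLine(HexOf(2f));    // 40000000
        Console.WriteLine(HexOf(-1.2f)); // BF99999A
    }
}
```

    These are the same patterns that appear for 1, 2 and -1.2 in the srcT row of the Single results.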

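    3. Results

    Because the Vector class provides so many methods, multiplied by 10 primitive types, the program's output is very long, more than 90 KB.
    To keep the article from getting too long, only the main parts of the output are excerpted here.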

    VectorClassDemo50

    IsRelease:	False
    EnvironmentVariable(PROCESSOR_IDENTIFIER):	Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
    Environment.ProcessorCount:	8
    Environment.Is64BitOperatingSystem:	True
    Environment.Is64BitProcess:	True
    Environment.OSVersion:	Microsoft Windows NT 10.0.19044.0
    Environment.Version:	7.0.0
    RuntimeEnvironment.GetRuntimeDirectory:	C:\Program Files\dotnet\shared\Microsoft.NETCore.App\7.0.0\
    RuntimeInformation.FrameworkDescription:	.NET 7.0.0
    BitConverter.IsLittleEndian:	True
    IntPtr.Size:	8
    Vector.IsHardwareAccelerated:	True
    Vector<byte>.Count:	32	# 256bit
    Vector<T>.Assembly.CodeBase:	file:///C:/Program Files/dotnet/shared/Microsoft.NETCore.App/7.0.0/System.Private.CoreLib.dll

    [Intrinsics.X86]
    Aes.IsSupported:	True
    Aes.X64.IsSupported:	True
    Avx.IsSupported:	True
    Avx.X64.IsSupported:	True
    Avx2.IsSupported:	True
    Avx2.X64.IsSupported:	True
    AvxVnni.IsSupported:	False
    AvxVnni.X64.IsSupported:	False
    Bmi1.IsSupported:	True
    Bmi1.X64.IsSupported:	True
    Bmi2.IsSupported:	True
    Bmi2.X64.IsSupported:	True
    Fma.IsSupported:	True
    Fma.X64.IsSupported:	True
    Lzcnt.IsSupported:	True
    Lzcnt.X64.IsSupported:	True
    Pclmulqdq.IsSupported:	True
    Pclmulqdq.X64.IsSupported:	True
    Popcnt.IsSupported:	True
    Popcnt.X64.IsSupported:	True
    Sse.IsSupported:	True
    Sse.X64.IsSupported:	True
    Sse2.IsSupported:	True
    Sse2.X64.IsSupported:	True
    Sse3.IsSupported:	True
    Sse3.X64.IsSupported:	True
    Sse41.IsSupported:	True
    Sse41.X64.IsSupported:	True
    Sse42.IsSupported:	True
    Sse42.X64.IsSupported:	True
    Ssse3.IsSupported:	True
    Ssse3.X64.IsSupported:	True
    X86Base.IsSupported:	True
    X86Base.X64.IsSupported:	True
    X86Serialize.IsSupported:	False
    X86Serialize.X64.IsSupported:	False

    [Intrinsics.Arm]
    AdvSimd.IsSupported:	False
    AdvSimd.Arm64.IsSupported:	False
    Aes.IsSupported:	False
    Aes.Arm64.IsSupported:	False
    ArmBase.IsSupported:	False
    ArmBase.Arm64.IsSupported:	False
    Crc32.IsSupported:	False
    Crc32.Arm64.IsSupported:	False
    Dp.IsSupported:	False
    Dp.Arm64.IsSupported:	False
    Rdm.IsSupported:	False
    Rdm.Arm64.IsSupported:	False
    Sha1.IsSupported:	False
    Sha1.Arm64.IsSupported:	False
    Sha256.IsSupported:	False
    Sha256.Arm64.IsSupported:	False

    -- Single, Vector<Single>.Count=8 --
    srcT:	<-3.4028235E+38, ∞, NaN, -1.2, 0, 1, 2, 4>	# (FF7FFFFF 7F800000 FFC00000 BF99999A 00000000 3F800000 40000000 40800000)
    srcAllOnes:	<NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN>	# (FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF)
    Abs(srcT):	<3.4028235E+38, ∞, NaN, 1.2, 0, 1, 2, 4>	# (7F7FFFFF 7F800000 7FC00000 3F99999A 00000000 3F800000 40000000 40800000)
    Abs(srcAllOnes):	<NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN>	# (7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF)
    Add(srcT, src1):	<-3.4028235E+38, ∞, NaN, -0.20000005, 1, 2, 3, 5>	# (FF7FFFFF 7F800000 FFC00000 BE4CCCD0 3F800000 40000000 40400000 40A00000)
    Add(srcT, src2):	<-3.4028235E+38, ∞, NaN, 0.79999995, 2, 3, 4, 6>	# (FF7FFFFF 7F800000 FFC00000 3F4CCCCC 40000000 40400000 40800000 40C00000)
    AndNot(srcT, src1):	<-3.9999998, 2, -3, -2.350989E-39, 0, 0, 2, 2>	# (C07FFFFF 40000000 C0400000 8019999A 00000000 00000000 40000000 40000000)
    AndNot(srcT, src2):	<-0.99999994, 1, -1.5, -1.2, 0, 1, 0, 1.1754944E-38>	# (BF7FFFFF 3F800000 BFC00000 BF99999A 00000000 3F800000 00000000 00800000)
    BitwiseAnd(srcT, src1):	<0.5, 1, 1, 1, 0, 1, 0, 1.1754944E-38>	# (3F000000 3F800000 3F800000 3F800000 00000000 3F800000 00000000 00800000)
    BitwiseAnd(srcT, src2):	<2, 2, 2, 0, 0, 0, 2, 2>	# (40000000 40000000 40000000 00000000 00000000 00000000 40000000 40000000)
    BitwiseOr(srcT, src1):	<NaN, ∞, NaN, -1.2, 1, 1, ∞, ∞>	# (FFFFFFFF 7F800000 FFC00000 BF99999A 3F800000 3F800000 7F800000 7F800000)
    BitwiseOr(srcT, src2):	<-3.4028235E+38, ∞, NaN, NaN, 2, ∞, 2, 4>	# (FF7FFFFF 7F800000 FFC00000 FF99999A 40000000 7F800000 40000000 40800000)
    ...
    Widen(srcT).low:	<-3.4028234663852886E+38, ∞, NaN, -1.2000000476837158>	# (C7EFFFFFE0000000 7FF0000000000000 FFF8000000000000 BFF3333340000000)
    Widen(srcT).high:	<0, 1, 2, 4>	# (0000000000000000 3FF0000000000000 4000000000000000 4010000000000000)
    ...

    -- Double, Vector<Double>.Count=4 --
    srcT:	<-1.7976931348623157E+308, ∞, -1.2, 0>	# (FFEFFFFFFFFFFFFF 7FF0000000000000 BFF3333333333333 0000000000000000)
    srcAllOnes:	<NaN, NaN, NaN, NaN>	# (FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF)
    Abs(srcT):	<1.7976931348623157E+308, ∞, 1.2, 0>	# (7FEFFFFFFFFFFFFF 7FF0000000000000 3FF3333333333333 0000000000000000)
    Abs(srcAllOnes):	<NaN, NaN, NaN, NaN>	# (7FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF)
    Add(srcT, src1):	<-1.7976931348623157E+308, ∞, -0.19999999999999996, 1>	# (FFEFFFFFFFFFFFFF 7FF0000000000000 BFC9999999999998 3FF0000000000000)
    Add(srcT, src2):	<-1.7976931348623157E+308, ∞, 0.8, 2>	# (FFEFFFFFFFFFFFFF 7FF0000000000000 3FE999999999999A 4000000000000000)
    AndNot(srcT, src1):	<-3.9999999999999996, 2, -4.4501477170144E-309, 0>	# (C00FFFFFFFFFFFFF 4000000000000000 8003333333333333 0000000000000000)
    AndNot(srcT, src2):	<-0.9999999999999999, 1, -1.2, 0>	# (BFEFFFFFFFFFFFFF 3FF0000000000000 BFF3333333333333 0000000000000000)
    BitwiseAnd(srcT, src1):	<0.5, 1, 1, 0>	# (3FE0000000000000 3FF0000000000000 3FF0000000000000 0000000000000000)
    BitwiseAnd(srcT, src2):	<2, 2, 0, 0>	# (4000000000000000 4000000000000000 0000000000000000 0000000000000000)
    BitwiseOr(srcT, src1):	<NaN, ∞, -1.2, 1>	# (FFFFFFFFFFFFFFFF 7FF0000000000000 BFF3333333333333 3FF0000000000000)
    BitwiseOr(srcT, src2):	<-1.7976931348623157E+308, ∞, NaN, 2>	# (FFEFFFFFFFFFFFFF 7FF0000000000000 FFF3333333333333 4000000000000000)
    ...

    -- UInt64, Vector<UInt64>.Count=4 --
    srcT:	<0, 18446744073709551615, 0, 1>	# (0000000000000000 FFFFFFFFFFFFFFFF 0000000000000000 0000000000000001)
    srcAllOnes:	<18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615>	# (FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF)
    Abs(srcT):	<0, 18446744073709551615, 0, 1>	# (0000000000000000 FFFFFFFFFFFFFFFF 0000000000000000 0000000000000001)
    Abs(srcAllOnes):	<18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615>	# (FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF)
    Add(srcT, src1):	<1, 0, 1, 2>	# (0000000000000001 0000000000000000 0000000000000001 0000000000000002)
    Add(srcT, src2):	<2, 1, 2, 3>	# (0000000000000002 0000000000000001 0000000000000002 0000000000000003)
    AndNot(srcT, src1):	<0, 18446744073709551614, 0, 0>	# (0000000000000000 FFFFFFFFFFFFFFFE 0000000000000000 0000000000000000)
    AndNot(srcT, src2):	<0, 18446744073709551613, 0, 1>	# (0000000000000000 FFFFFFFFFFFFFFFD 0000000000000000 0000000000000001)
    BitwiseAnd(srcT, src1):	<0, 1, 0, 1>	# (0000000000000000 0000000000000001 0000000000000000 0000000000000001)
    BitwiseAnd(srcT, src2):	<0, 2, 0, 0>	# (0000000000000000 0000000000000002 0000000000000000 0000000000000000)
    BitwiseOr(srcT, src1):	<1, 18446744073709551615, 1, 1>	# (0000000000000001 FFFFFFFFFFFFFFFF 0000000000000001 0000000000000001)
    BitwiseOr(srcT, src2):	<2, 18446744073709551615, 2, 3>	# (0000000000000002 FFFFFFFFFFFFFFFF 0000000000000002 0000000000000003)
    ...
    ShiftLeft(srcT, 1):	<0, 18446744073709551614, 0, 2>	# (0000000000000000 FFFFFFFFFFFFFFFE 0000000000000000 0000000000000002)
    ShiftLeft(srcT, 63):	<0, 9223372036854775808, 0, 9223372036854775808>	# (0000000000000000 8000000000000000 0000000000000000 8000000000000000)
    ShiftLeft(srcT, 64):	<0, 18446744073709551615, 0, 1>	# (0000000000000000 FFFFFFFFFFFFFFFF 0000000000000000 0000000000000001)
    ShiftLeft(srcT, 65):	<0, 18446744073709551614, 0, 2>	# (0000000000000000 FFFFFFFFFFFFFFFE 0000000000000000 0000000000000002)
    ShiftLeft(srcT, -1):	<0, 9223372036854775808, 0, 9223372036854775808>	# (0000000000000000 8000000000000000 0000000000000000 8000000000000000)

    For the full results, run the program yourself.
    Source code:
    https://github.com/zyl910/BenchmarkVector/tree/main/VectorClassDemo

    References