I had to clean up the topic a bit as it had turned more emotional than professional.

Now back to our lambs (c) ...
1) Use Intel Iometer. There are tons of good (and bad) benchmarks and we physically cannot afford to spend time learning all of them.
2) Configure a proper test bed. Eliminate as many variables as you can. Say, for the initiator use a Windows Server 2012 machine (bare metal, not a VM) and
for the target use the same hardware, just swapping the boot disks between Windows (for StarWind and MS target) and FreeBSD (FreeNAS actually). We're looking at
latency so one spindle will be fine for the tests. Running the tests in loopback will give you number A (raw disk performance) and running the same test
pattern over GbE will give number B. The difference B - A tells us how much latency each target (StarWind, MS target and FreeNAS) brings to the table.
3) I do understand you're interested in ESXi performance for now, but to know whether this is an ESXi or StarWind or handshake or whatever issue, start with
Windows-to-Windows or Windows-to-<others> tests. If everything goes flawlessly there, we'll then find out what's wrong with the ESXi config (if anything), ESXi drivers etc.
4) Document everything: hardware, software configs and interconnection diagrams. Make sure you run a "short stroke" (using, say, the first 500GB of the disk)
and use the same space for all the tests. And write some random (not all zeros!) pattern to it first, as SSDs and ZFS will give crazy numbers when doing I/O
on non-allocated data.
5) When running the tests turn cache OFF. MS target has none, StarWind should have cache disabled and the same should be configured for ZFS: no dedupe,
no WB cache, no L2ARC. Non-cached numbers will give us the LOW watermark of performance, the WORST case, which is what we're looking for. Cache will improve
real-world numbers but in these tests it will just mess everything up.
6) Run a proper test pattern. 4-8 workers (simulating VMs) with a queue depth of at least 8-16 I/Os. Random access first:
4KB blocks (native block for modern hard disks) 100% read
4KB blocks 100% write
64KB blocks 100% read
64KB blocks 100% write
then the same four again but sequential. That should be 8 charts for every target, so 3 targets will give us 24 pictures, and 32 when adding the raw disk in loopback (Windows should be fine).
Make sure you run each test for a long time (10+ minutes) so everything stabilizes. I'd even run 30-minute tests if you care and have time.
Once Part I is done and found to be OK we'll continue with Part II (ESXi initiator). Then we can dive into a stripped-down set of tests run inside VMs.
Going to that scenario directly will tell us nothing as there are too many places where things can break (ESXi handshake, ZFS cache etc).
7) We'd be happy to help you with a remote session (if required) and we'd also love to see you blogging the final results for StarWind vs. MS target vs. FreeNAS
on the same hardware. Just *PLEASE* don't post premature ones (the ones we did not confirm), as we can do nothing there and would be giving up at that point.
People will love what you do.
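
If it helps with the bookkeeping, here is a rough Python sketch of the test matrix from step 6 and the "B minus A" comparison from step 2. It is only an illustration: the target labels, function names and the dictionary layout are my own assumptions, not anything Iometer produces, and you would fill in the latency numbers by hand from your own runs.

```python
from itertools import product

# Illustrative labels only; these are not Iometer settings or file names.
TARGETS = ["raw-loopback", "StarWind", "MS target", "FreeNAS"]
BLOCK_SIZES_KB = [4, 64]
OPERATIONS = ["read", "write"]        # each run is 100% read or 100% write
ACCESS = ["random", "sequential"]


def build_patterns():
    """Return the 8 access patterns as (block_kb, operation, access) tuples."""
    return [(b, op, acc) for acc, b, op in product(ACCESS, BLOCK_SIZES_KB, OPERATIONS)]


def build_runs():
    """Return every (target, pattern) pair: 4 targets x 8 patterns = 32 runs."""
    return [(t, p) for t in TARGETS for p in build_patterns()]


def added_latency_ms(raw_ms, target_ms):
    """Number B minus number A for each pattern present in both result sets.

    Both arguments map a pattern tuple to an average latency in milliseconds:
    raw_ms is the loopback run (A), target_ms is the run over GbE (B).
    """
    return {p: target_ms[p] - raw_ms[p] for p in raw_ms if p in target_ms}


if __name__ == "__main__":
    # Print a checklist of all runs so none gets skipped.
    for target, (block_kb, op, access) in build_runs():
        print(f"{target:12s} {block_kb:>2d}KB 100% {op:5s} {access}")
```

Feed `added_latency_ms` the averages from the raw loopback run and from one target's run, and you get the per-pattern overhead that target adds. Keeping the pattern tuple as the key makes it hard to compare mismatched runs by accident.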
Thank you!