strange problem with loosing connections

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Post Reply
mvn
Posts: 1
Joined: Fri May 15, 2009 10:30 am

Fri May 15, 2009 11:16 am

We are having strange problems and the log file is reaaly growing fast with the same line all the time:

i think this is good
5/15 10:12:17.490 1764 PR: w3k-isa2-iscsi-w2k8-fs1 RESERVE_IN: READ_KEYS.
5/15 10:12:17.490 1788 PR: w3k-sql1-iscsi-w2k8-fs1 RESERVE_IN: READ_KEYS.
5/15 10:12:17.490 1704 PR: w2k8-tsgateway-iscsi-w2k8-fs1 RESERVE_IN: READ_KEYS.
5/15 10:12:18.457 1050 PR: w3k-isa1-iscsi-w2k8-fs1 RESERVE_IN: READ_KEYS.
5/15 10:12:18.457 824 PR: w2k3-sps1-iscsi-w2k8-fs1 RESERVE_IN: READ_KEYS.
5/15 10:12:19.456 10c0 PR: w3k-ts3-iscsi-w2k8-fs1 RESERVE_IN: READ_KEYS.

then this shows up
5/15 10:12:26.694 1760 T[2e,dd28]: Management command: LUN reset.
5/15 10:12:26.663 17fc T[3a,de81]: Management command: LUN reset.
5/15 10:12:26.679 1724 T[29,dd51]: Management command: LUN reset.
5/15 10:12:35.040 1724 T[29,dd50] CmdSn 56655, ITT 0xdd4f is not completed.
5/15 10:12:35.040 1724 T[29,dd50]: state 0x600, immediate 0, ITT 0xdd4f, CmdSn 56655, TTT 0x1baa0
5/15 10:12:35.040 1724 T[29,dd50]: read/write 10, read length 136, read done 32, write length 0, write done 0, DATA-IN PDUs 0
5/15 10:12:35.040 1724 T[29,dd50]: DataSN 0, R2TSN 0, status 0, status class 0, status detail 0, response 0, counter 0, authStage 0
5/15 10:12:26.803 17b4 T[35,de8b]: Management command: LUN reset.
5/15 10:12:35.040 17b4 T[35,de8a] CmdSn 56969, ITT 0xde89 is not completed.
5/15 10:12:35.040 17b4 T[35,de8a]: state 0x600, immediate 0, ITT 0xde89, CmdSn 56969, TTT 0x1bd14
5/15 10:12:35.040 17b4 T[35,de8a]: read/write 10, read length 136, read done 24, write length 0, write done 0, DATA-IN PDUs 0
5/15 10:12:35.040 17b4 T[35,de8a]: DataSN 0, R2TSN 0, status 0, status class 0, status detail 0, response 0, counter 0, authStage 0
5/15 10:12:26.679 1784 T[31,ad173]: Management command: LUN reset.
5/15 10:12:35.040 1784 T[31,ad171] CmdSn 708976, ITT 0xad170 is not completed.

after a list of the above lines we get:
5/15 10:12:35.321 1778 C[30], LIN: *** 'recv' thread: recv failed 10058.
5/15 10:12:35.321 17b4 C[35], LIN: recvData returned 10054
5/15 10:12:35.321 17b4 C[35], LIN: *** 'recv' thread: recv failed 10058.
5/15 10:12:35.321 1724 C[29], LIN: recvData returned 10054
5/15 10:12:35.321 1724 C[29], LIN: *** 'recv' thread: recv failed 10058.
5/15 10:12:35.321 388 C[1a], LIN: recvData returned 10054
5/15 10:12:35.321 388 C[1a], LIN: *** 'recv' thread: recv failed 10058.
5/15 10:12:35.321 10f8 C[c], LIN: recvData returned 10054
5/15 10:12:35.321 10f8 C[c], LIN: *** 'recv' thread: recv failed 10058.
5/15 10:12:35.399 778 Srv: Accepted iSCSI connection from 10.31.10.30:58715. (Id = 0x40)
5/15 10:12:35.461 778 C[40], FREE: Event - CONNECTED.
5/15 10:12:35.461 778 C[40], XPT_UP: T3.
5/15 10:12:35.851 778 Srv: Accepted iSCSI connection from 10.31.10.30:58716. (Id = 0x41)
5/15 10:12:35.851 778 C[41], FREE: Event - CONNECTED.
5/15 10:12:35.851 778 C[41], XPT_UP: T3.
5/15 10:12:35.929 778 Srv: Accepted iSCSI connection from 10.31.10.30:58717. (Id = 0x42)
5/15 10:12:35.929 778 C[42], FREE: Event - CONNECTED.
5/15 10:12:35.929 778 C[42], XPT_UP: T3.

then the reconnecting starts:
5/15 10:12:36.288 103c C[41], XPT_UP: Login request: ISID 0x400001370001, TSIH 0x0000.
5/15 10:12:36.288 1624 C[51], XPT_UP: Login request: ISID 0x400001370001, TSIH 0x0000.
5/15 10:12:36.288 f90 C[4e], XPT_UP: Login request: ISID 0x400001370002, TSIH 0x0000.
5/15 10:12:36.288 1830 C[4c], XPT_UP: Login request: ISID 0x400001370001, TSIH 0x0000.
5/15 10:12:36.288 1a08 C[48], XPT_UP: Login request: ISID 0x400001370002, TSIH 0x0000.
5/15 10:12:36.288 694 C[46], XPT_UP: Login request: ISID 0x400001370004, TSIH 0x0000.
5/15 10:12:36.475 1128 C[40], XPT_UP: Event - LOGIN.
5/15 10:12:36.475 1128 C[40], IN_LOGIN: T4.
5/15 10:12:36.475 103c C[41], XPT_UP: Event - LOGIN.
5/15 10:12:36.475 103c C[41], IN_LOGIN: T4.
5/15 10:12:36.475 103c Params: <<< String param 'InitiatorName': received 'iqn.1991-05.com.microsoft:w2k8-dc2.gtsgral.local', accepted 'iqn.1991-05.com.microsoft:w2k8-dc2.gtsgral.local'
5/15 10:12:36.475 103c Params: <<< Enum param 'SessionType': received 'Normal', accepted 'Normal'
5/15 10:12:36.475 15a0 C[42], XPT_UP: Event - LOGIN.
5/15 10:12:36.475 15a0 C[42], IN_LOGIN: T4.
5/15 10:12:36.475 103c Params: <<< String param 'TargetName': received 'w2k8-ts1-iscsi-w2k8-fs1', accepted 'w2k8-ts1-iscsi-w2k8-fs1'
5/15 10:12:36.475 103c Params: <<< Enum param 'AuthMethod': received 'None', accepted 'None'
5/15 10:12:36.475 15a0 Params: <<< String param 'InitiatorName': received 'iqn.1991-05.com.microsoft:w2k8-dc2.gtsgral.local', accepted 'iqn.1991-05.com.microsoft:w2k8-dc2.gtsgral.local'
5/15 10:12:36.475 15a0 Params: <<< Enum param 'SessionType': received 'Normal', accepted 'Normal'
5/15 10:12:36.475 15a0 Params: <<< String param 'TargetName': received 'w2k8-tsgateway-iscsi-w2k8-fs1', accepted 'w2k8-tsgateway-iscsi-w2k8-fs1'
5/15 10:12:36.475 15a0 Params: <<< Enum param 'AuthMethod': received 'None', accepted 'None'
5/15 10:12:36.475 1128 Params: <<< String param 'InitiatorName': received 'iqn.1991-05.com.microsoft:w2k8-dc2.gtsgral.local', accepted 'iqn.1991-05.com.microsoft:w2k8-dc2.gtsgral.local'
5/15 10:12:36.475 1128 Params: <<< Enum param 'SessionType': received 'Normal', accepted 'Normal'
5/15 10:12:36.475 1128 Params: <<< String param 'TargetName': received 'w3k-isa2-iscsi-w2k8-fs1', accepted 'w3k-isa2-iscsi-w2k8-fs1'
5/15 10:12:36.475 984 C[45], XPT_UP: Event - LOGIN.
5/15 10:12:36.475 1128 Params: <<< Enum param 'AuthMethod': received 'None', accepted 'None'
5/15 10:12:36.475 984 C[45], IN_LOGIN: T4.

then there is a large block with connections allowed and numeric param or boolean param
finally this keeps repeating:
5/15 10:12:37.661 1b74 PR: S[4b] LUN 0 is PR reserved with key 000063387466734D
5/15 10:12:37.661 1b74 PR: Per: S[4b] ScsiOp 0x2A - Reservation conflict.
5/15 10:12:37.661 1278 PR: S[46] LUN 0 is PR reserved with key 000063387466734D
5/15 10:12:37.661 1278 PR: Per: S[46] ScsiOp 0x2A - Reservation conflict.
5/15 10:12:37.661 1b74 PR: S[4b] LUN 0 is PR reserved with key 000063387466734D
5/15 10:12:37.661 1b74 PR: Per: S[4b] ScsiOp 0x2A - Reservation conflict.
5/15 10:12:37.661 1278 PR: S[46] LUN 0 is PR reserved with key 000063387466734D
5/15 10:12:37.661 1278 PR: Per: S[46] ScsiOp 0x2A - Reservation conflict.

It starts that we loose connection to the iscsi target and out virtual machines are not able to write to the harddisk when we check everything is working again but most services need reboots to get working good again. Does anyone have a idea what is going on?
Robert (staff)
Posts: 303
Joined: Fri Feb 13, 2009 9:42 am

Fri May 15, 2009 12:43 pm

Hello,

Can you send us the entire log file to support@starwindsoftware.com

Also, can you provide a detailed description of your configuration, such as OSes on the Target and on Initiator, Initiators used, types of virtual devices that are created on a target and the hypervisor type?

Thank you.
Rob
Robert
StarWind Software Inc.
http://www.starwindsoftware.com
andrewwhite
Posts: 1
Joined: Mon Mar 15, 2010 3:07 pm

Mon Mar 15, 2010 3:08 pm

This same issue occurs with me, was a solution ever found for this problem?
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Mon Mar 15, 2010 3:48 pm

Task abort issue. If you run the most recent build with the cache enabled it could happen under heavy load. Could you please send you logs to support@starwindsoftware.com so we could check this out. Thanks!
andrewwhite wrote:This same issue occurs with me, was a solution ever found for this problem?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Image
rgransbury
Posts: 1
Joined: Mon Apr 19, 2010 5:10 pm

Mon Apr 19, 2010 5:12 pm

I just experianced the same issue using v5.3.0 (Build 20100402).
Is this issue still open?
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Mon Apr 19, 2010 6:12 pm

No it's closed. If you had applied VMware patches, using latest StarWind build and error-free network drivers.
rgransbury wrote:I just experianced the same issue using v5.3.0 (Build 20100402).
Is this issue still open?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Image
jeremym87
Posts: 2
Joined: Mon Sep 20, 2010 1:56 pm

Mon Sep 20, 2010 1:59 pm

I'm having this issue with Starwind version 3.5.5, using Hyper-V 2008 R2 with failover clustering. Log files are getting huge, and the service quits. Log files are 9+ GB.

what do I need to do?
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Mon Sep 20, 2010 3:59 pm

Start with upgrading to the V5.4 (production version for now) and also check V5.5 beta we'd publish Gold in a week or two. Issue you're facing was supposed to go away with 5.4 and later official builds.
jeremym87 wrote:I'm having this issue with Starwind version 3.5.5, using Hyper-V 2008 R2 with failover clustering. Log files are getting huge, and the service quits. Log files are 9+ GB.

what do I need to do?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Image
jeremym87
Posts: 2
Joined: Mon Sep 20, 2010 1:56 pm

Mon Sep 20, 2010 11:16 pm

Great thanks!
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Sep 21, 2010 5:43 pm

You're welcomed! Please keep us updated about your upgrade etc process. Thanks!
jeremym87 wrote:Great thanks!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Image
grangej
Posts: 1
Joined: Thu Nov 25, 2010 7:22 am

Thu Nov 25, 2010 7:26 am

I am also seeing this with the 5.4 release, and I have the latest patch level on my vSphere 4.1 hosts. I have also been speaking with your support team, but I wanted to chime in and say that this is most deferentially not fixed using 5.4 / Latest VMware patches. I noticed some talk about caching, would it be recommended to turn off caching until this is fixed?
User avatar
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Thu Nov 25, 2010 8:49 am

1) Upgrade to V5.5 (newborn) and make sure you've installed all latest VMware patches.

2) Caching has nothing to do / brake with this issue.
grangej wrote:I am also seeing this with the 5.4 release, and I have the latest patch level on my vSphere 4.1 hosts. I have also been speaking with your support team, but I wanted to chime in and say that this is most deferentially not fixed using 5.4 / Latest VMware patches. I noticed some talk about caching, would it be recommended to turn off caching until this is fixed?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Image
Post Reply